>robinlinacre
Latest Posts
2024-09-21
Connected components visualisation
2024-09-02
Match weight calculator
2024-01-18
Super-fast deduplication of large datasets using Splink and DuckDB
Data science and engineering
2023-10-19
Thoughts and questions about the short term impact of LLMs on knowledge workers
2023-03-09
Splink and the Open Source Dividend
2023-01-30
SQL should be the default choice for data transformation logic
2023-01-09
Why parquet files are my preferred API for bulk open data
2022-08-05
Splink 3: Fast, accurate and scalable linkage in Python
2021-10-29
The Thorniest Problem of Building an Analytical Platform
2020-11-07
The Downfall of Command and Control Data Leadership
2020-10-22
Demystifying Apache Arrow
2020-04-16
Fuzzy Matching and Deduplicating Hundreds of Millions of Records with Splink
2020-02-22
Why you should open source your analytical work
2019-12-08
Understanding the Spark UI by example: sorting data
2019-12-01
Understanding the Spark UI by example: the Left Join
2019-11-15
Spark UI SQL detailed annotator
2019-10-11
Interactive blogging with Observable Notebooks and gatsby.js
2019-08-26
Effective testing of analytical models using automated sense checks
2019-03-14
Questions Senior Leaders Should Ask Their Data Delivery Teams
2018-08-22
Why I’m backing Vega-Lite as our default tool for data visualisation
2018-08-11
Transforming analytical functions by mainstreaming data science
Probabilistic record linkage
See the probabilistic linkage training materials homepage.
2024-09-21
Connected components visualisation
2024-09-02
Match weight calculator
2024-01-18
Super-fast deduplication of large datasets using Splink and DuckDB
2023-10-24
Why Probabilistic Linkage is More Accurate than Fuzzy Matching For Data Deduplication
2023-10-18
Visualising updating a prior
2023-10-02
Computing the Fellegi Sunter model
2023-09-22
m and u values in the Fellegi-Sunter model
2023-09-20
Partial match weights
2023-07-07
The relationship between probabilities, match weights and Bayes factors
2022-10-14
The Intuition Behind the Use of Expectation Maximisation to Train Record Linkage Models
2021-11-15
m and u probability generator with starting values
2021-11-05
Are more complex probabilistic linkage models more accurate? Part 2, unsupervised learning
2021-11-01
Are more complex probabilistic linkage models more accurate? Part 1, supervised learning
2021-06-10
m and u probability generator
2021-06-10
Dependencies between match weights
2021-05-23
Understanding match weights in the Fellegi Sunter model
2021-05-22
Visualising the Fellegi Sunter model
2021-05-21
Maths of Fellegi Sunter (old version)
2021-05-21
The mathematics of the Fellegi Sunter model
2021-05-20
An Interactive Introduction to Record Linkage (Data Deduplication) in the Fellegi-Sunter framework
2019-11-03
Unsupervised probabilistic data matching using the Expectation Maximisation algorithm
Energy and climate change
2021-09-03
The carbon impact of switching to an electric car
2020-04-17
Comparing energy usage across countries
2020-04-17
Filling the country with solar panels
2019-10-13
Carbon offsetting vs. the cost of renewable energy
2019-10-09
Flight distance calculator
2019-10-05
Energy usage ready reckoner
2019-10-05
My flights
Other
2022-11-11
Why don't you just
2020-04-26
Birdsong quiz
2020-04-25
Birdsong recording finder