


If you'd like to follow along, you can use our example data for this tutorial.

To start a data linking session, upload a dataset you want to link by clicking the ‘Upload a new dataset’ button. Here, you can name your dataset and provide an optional description for it. You will also be prompted to Create a new project or Add to an existing project. Projects are used for grouping multiple datasets together for matching. We’re starting a new project here, so I’ll fill in the project name and project description.

Dedupe.io supports uploading datasets that are already de-duplicated (one row for each unique record). If this is the case, you can select the ‘No duplicates, I want to compare it to another dataset’ radio button. Selecting this will skip all the Dedupe.io steps and mark your dataset as de-duplicated in the system. In this example, I am starting a new project and selecting the ‘No duplicates, I want to compare it to another dataset’ radio button. If your data is not already de-duplicated, you can follow our intro tutorial on de-duplicating one dataset.

When it’s done, I will upload the second dataset: Restaurants 2. For this dataset, I am going to Add to an existing project and select my Restaurant Matching project from the list.

Adding Restaurants 2 to the Restaurant Matching project

Once I’ve uploaded the data and it has been processed, we’ll move on to the next step.

Next, we will identify the fields in each dataset that we want Dedupe.io to pay attention to for finding duplicates. You’ll be shown a drop-down list of column names to pick from. Next to that, you will pick an option from ‘Compare as’ to tell Dedupe.io how to compare values in that column.
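
To make the idea of choosing fields and a comparison type for each one more concrete, here is a minimal Python sketch. It is not how Dedupe.io works internally; the column names (`name`, `address`, `city`), the comparison labels (`text`, `exact`), the helper functions, and the two sample records are all assumptions made up for illustration. The text comparison uses the standard library's `difflib.SequenceMatcher` as a stand-in for whatever similarity measure the service actually applies.

```python
from difflib import SequenceMatcher

# Hypothetical field selection: column name -> how to compare it.
# Both the column names and the comparison labels are illustrative only.
FIELDS = {
    "name": "text",
    "address": "text",
    "city": "exact",
}


def compare_text(a: str, b: str) -> float:
    """Return a similarity in [0, 1] based on matching character runs."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def compare_exact(a: str, b: str) -> float:
    """Return 1.0 if the values match after trimming and lowercasing, else 0.0."""
    return 1.0 if a.strip().lower() == b.strip().lower() else 0.0


COMPARATORS = {"text": compare_text, "exact": compare_exact}


def field_scores(rec_a: dict, rec_b: dict, fields: dict = FIELDS) -> dict:
    """Compare two records column by column using each column's comparator."""
    return {
        col: COMPARATORS[kind](rec_a.get(col, ""), rec_b.get(col, ""))
        for col, kind in fields.items()
    }


# Made-up records standing in for one row from each uploaded dataset.
record_1 = {"name": "Main Street Diner", "address": "12 Main Street", "city": "Springfield"}
record_2 = {"name": "Main St. Diner", "address": "12 Main St.", "city": "springfield"}

scores = field_scores(record_1, record_2)
for column, score in scores.items():
    print(f"{column}: {score:.2f}")
```

Only the columns listed in `FIELDS` contribute to the comparison, which mirrors the point of the field selection step: columns you leave out are ignored when looking for matching records.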
