Genetic Affairs: User Documentation

Welcome to Genetic Affairs

Genetic Affairs public website front page showing a cluster chart and 'Unleash your DNA matches' tagline
The public site at www.geneticaffairs.com: start here for information and account registration.

Genetic Affairs is a genetic genealogy platform created by Dr Evert-Jan Blom, a Dutch molecular geneticist and bioinformatician. It automates the heavy lifting of DNA-match analysis: downloading your matches, clustering them, pulling trees together to spot common ancestors, and reconstructing partial family trees from the evidence.

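The platform runs at two addresses: the public site at www.geneticaffairs.com, and the members site at members.geneticaffairs.com, where you register and run analyses.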

What changed recently

Supported testing companies

Company | Automated analysis | AutoLineage import options
FamilyTreeDNA | ✅ The only fully automated company | nodes.csv + edges.csv from an automated run; CSV files from DNA Gedcom
MyHeritage | AutoCluster embedded at MyHeritage | Locally saved HTML files; AutoCluster HTML file; CSV files from DNA Gedcom
Ancestry | ❌ (automated retrieval ended 2020) | Copy-Paste Wizard; locally saved HTML files; CSV files from DNA Gedcom
GEDmatch | AutoCluster & AutoKinship as Tier 1 | nodes.csv + edges.csv from an automated run; one-to-many HTML/CSV files
23andMe | | Locally saved HTML files; AutoCluster HTML file; CSV files from DNA Gedcom
All vendors | | AutoLineage backup files (matches and shared matches)

The tools, in brief

AutoCluster groups your matches by shared-match relationships. AutoTree finds common ancestors in your matches' trees. AutoSegment clusters by overlapping DNA segments. AutoKinship predicts trees from DNA alone. AutoLineage is the cross-company workbench that blends all of the above.

Getting started

Create an account

  1. Visit members.geneticaffairs.com/register.
  2. Enter name, email and password.
  3. Confirm your email. You land on the members page.

New accounts receive 200 free credits, enough to try an automated AutoCluster or AutoKinship run on a moderate-sized match list.

Credits and subscriptions

Each analysis costs a small number of credits. You can buy credits in single top-ups or subscribe monthly. Monthly subscribers receive a 10% bonus on every purchase and unlock advanced AutoLineage features. Payments use Stripe.

Pricing plans

Free Trial: $0 / month
  • 200 complimentary credits for new registered users
  • Analysis of close DNA matches
  • AutoCluster
  • AutoTree
  • AutoSegment
  • CSV file-based AutoCluster
  • Reclustering MyHeritage AutoClusters
Subscriber (most popular): $5–50 / month
  • Everything in Free Trial
  • Analysis of close and more distant DNA matches
  • Hybrid AutoSegment
  • AutoKinship
  • AutoTree for FTDNA Y-DNA / mtDNA
  • AutoPedigree
  • AutoLineage (advanced features)
  • Full support

Adding an FTDNA profile

FTDNA is the only company that accepts profile registration on Genetic Affairs. Ancestry and 23andMe are no longer supported for automated retrieval.

The members site now includes a streamlined interface designed specifically for adding FTDNA profiles. You no longer need to provide your FTDNA password when adding a profile. Your password (and 2FA code, if enabled) is entered only when starting an analysis, which makes the process safer and more privacy-friendly.

  1. On the members landing page, click Register a new website.
  2. The wizard opens with the title Register FamilyTreeDNA Profile.
  3. Enter your FamilyTreeDNA kit number and the name of the tested person.
  4. Confirm with Register profile.
  5. A notice will appear about FTDNA's password policy update. Read it and click I understand, continue.
  6. Once verified, you arrive at the Websites / Profiles page.

Running an FTDNA analysis: the wizard

From the Websites / Profiles overview, click Start analysis next to the profile you want to run. A wizard guides you through every step:

  1. Select analysis: choose AutoCluster, AutoSegment ICW, AutoTree, or AutoKinship.
  2. Adjust cM range: set the upper and lower shared-cM thresholds.
  3. Match selection: run on all matches, or start with specific matches of interest.
  4. Password & 2FA: enter your FTDNA password and, if enabled, your 2FA code.
  5. Cost summary: review the credit cost before confirming and click Perform analysis. Results arrive by email as a ZIP file.

The new analysis wizard. Your password is entered at run-time only, encrypted locally, used to pull your match list, then discarded.

Two-factor authentication (2FA)

Genetic Affairs now fully supports FTDNA's two-factor authentication. When 2FA is enabled on your FTDNA account, simply enter your 2FA code in the wizard's password step.

⚠ Only the new wizard supports 2FA. The older profile page (accessible via Websites / Profiles in the top menu) does not support 2FA. If you prefer the old page, note this limitation.

Always unzip first

Your analysis results arrive by email as a ZIP file and appear in the notifications bell on the members site. Unzip before opening. The HTML file inside the ZIP contains links to other files in the same folder; opening it inside the ZIP produces broken-link errors.
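
If you prefer to unpack reports from a script, the step above can be sketched in Python. This is a minimal illustration, not part of Genetic Affairs: the function name and the file names in the usage note are hypothetical, and it only assumes that the report is a standard ZIP archive containing at least one HTML file.

```python
import zipfile
from pathlib import Path


def unzip_report(zip_path: str, target_dir: str) -> list[str]:
    """Extract every file from a results ZIP and return the extracted HTML paths."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        # Extract everything, not just the HTML: the chart links to its
        # sibling files, which must sit next to it on disk.
        archive.extractall(target)
    return sorted(str(path) for path in target.rglob("*.html"))
```

For example, unzip_report("autocluster_results.zip", "autocluster_results") would return the HTML files to open in your browser (both names are placeholders for your own download).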

Clustering background

Every clustering tool in Genetic Affairs builds on the same idea: people who share DNA with you and also share DNA with each other probably descend from a common branch of your family tree. That simple observation has been the engine behind genetic-genealogy clustering since 2018, and it is still the engine behind the modern AutoLineage workflow. This chapter explains how clustering works, why the colourful diagonal charts look the way they do, and how the technique has evolved from a hand-coloured spreadsheet into today's weighted clusters annotated with named ancestors.

DNA matches and shared matches

A DNA match is another tester who shares one or more DNA segments with you. Most testing sites express match strength as the total centiMorgans (cM) shared: roughly, how much DNA you have in common. A close cousin might share 500–1000 cM with you; a distant cousin might share 20–50 cM.

A shared match, sometimes called an In Common With (ICW) match, is a third person who matches both you and one of your matches. Shared-match information is what makes clustering possible. Instead of just knowing that matches A, B, and C all match you, you also know that A and B match each other but A and C do not. That extra information is enough to separate your matches into branches.

If two of your matches also share DNA with each other, they most likely descend from a common branch of your tree. If they do not share DNA with each other, they probably descend from different branches. Your DNA reaches you through four grandparent lines, so your matches tend to fall into a handful of distinct groups, one per ancestral line that contributed meaningfully to your genome.

How a cluster forms

A cluster chart is a square grid. Every row is one of your DNA matches, and so is every column: the same people appear on both axes. Each cell in the grid represents the intersection between two of your matches.

The clustering algorithm rearranges the rows and columns so that matches who are densely connected to each other end up next to each other. The result is the characteristic diagonal of coloured squares: each square is a cluster, and the matches in that cluster form a dense little community of shared matches. Each cluster most often represents one of your ancestral lines.

AutoCluster chart showing distinct coloured clusters along the diagonal with grey shared-match cells between them
A typical AutoCluster chart. Every row and column is one of the tester's DNA matches. Coloured blocks along the diagonal are clusters: groups of matches who share DNA with each other as well as with the tester. Grey cells off the diagonal flag matches that straddle more than one cluster.
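
To make the grid idea concrete, here is a minimal Python sketch. It is not the AutoCluster algorithm: as a simplifying assumption it groups matches by connected components of the shared-match graph, which is enough to show why reordering rows and columns produces blocks along the diagonal. The function name and input shapes are hypothetical.

```python
from collections import defaultdict


def shared_match_grid(matches, shared_pairs):
    """Return (ordered_matches, grid); grid[i][j] is 1 if the two matches share DNA."""
    known = set(matches)
    neighbours = defaultdict(set)
    for a, b in shared_pairs:
        if a in known and b in known:
            neighbours[a].add(b)
            neighbours[b].add(a)

    # Walk each connected group in turn so that its members end up
    # adjacent in the ordering (one diagonal block per group).
    ordered, seen = [], set()
    for start in matches:
        if start in seen:
            continue
        stack = [start]
        seen.add(start)
        while stack:
            current = stack.pop()
            ordered.append(current)
            for other in sorted(neighbours[current]):
                if other not in seen:
                    seen.add(other)
                    stack.append(other)

    # Same people on both axes; the diagonal itself is filled in.
    grid = [
        [1 if (row == col or col in neighbours[row]) else 0 for col in ordered]
        for row in ordered
    ]
    return ordered, grid
```

With matches A, C, B, D and shared-match pairs (A, B) and (C, D), the ordering becomes A, B, C, D and the grid shows two 2×2 blocks on the diagonal.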

A short history of DNA cluster analysis

The key idea, use shared matches to separate cousins into ancestral branches, has been stable for years. What has changed is the automation around it.

2018: The Leeds Method

In mid-2018, Dana Leeds published a remarkably simple procedure: take your closest AncestryDNA matches between roughly 90 and 400 cM, list them in a spreadsheet, and for each match, colour every shared match the same colour. Move down the list, pick a new colour for the next uncoloured match, colour all of their shared matches, and repeat. For most testers, four colours emerge, typically one per grandparent line. The technique is sometimes called DNA Color Clustering or the Color Cluster Method, but it is widely known as the Leeds Method.

Why the range of 90–400 cM? Lower than 90 cM and shared-match calls become unreliable; higher than about 400 cM and you run into close relatives who match you through multiple branches and would paint the whole tree the same colour. The Leeds Method works because in the 90–400 cM band, most matches are 2nd through 4th cousins, close enough for the shared-match relationship to be meaningful but distant enough to usually descend through a single grandparent.
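
The colouring procedure can be sketched in a few lines of Python. This is an illustration of the steps described above, not Dana Leeds' own material: the function name, the input shapes and the choice to number colours from 0 are assumptions made for the example.

```python
def leeds_colours(matches, shared_matches, low_cm=90, high_cm=400):
    """Assign Leeds-style colours.

    matches: list of (name, shared_cm) tuples.
    shared_matches: dict mapping a name to the set of names it shares with.
    Returns a dict name -> set of colour indices.
    """
    # Keep only matches inside the cM band, strongest first.
    in_range = [
        name
        for name, cm in sorted(matches, key=lambda m: -m[1])
        if low_cm <= cm <= high_cm
    ]
    colours = {name: set() for name in in_range}
    next_colour = 0
    for name in in_range:
        if colours[name]:
            continue  # already coloured via an earlier match
        # New colour for this match and for all of its shared matches.
        colours[name].add(next_colour)
        for other in shared_matches.get(name, set()):
            if other in colours:
                colours[other].add(next_colour)
        next_colour += 1
    return colours
```

A match that ends up with more than one colour is the spreadsheet equivalent of a relative who connects to more than one grandparent line.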

The Leeds Method was a genealogy-community revelation precisely because it did not require a chromosome browser, a tree, or any segment maths. It worked on AncestryDNA, which provides none of those things, and it worked for adoptees and unknown-parentage cases as well as for people with rich family trees. Dana Leeds' original articles and follow-ups are linked in the references at the end of this chapter.

Late 2018: AutoCluster automates it

Evert-Jan Blom, the developer of Genetic Affairs, saw that the Leeds Method was essentially a community-detection problem and could be automated. In late 2018 he released AutoCluster, which did exactly what the Leeds Method did by hand, but automatically, for hundreds of matches at once, with a clean diagonal chart as output. Instead of the user colouring rows by inspection, a clustering algorithm reordered the matrix so that densely interconnected matches ended up next to each other, and each cluster received its own colour.

AutoCluster was initially available for AncestryDNA, FamilyTreeDNA and 23andMe.

One of the first FTDNA AutoCluster results for Roberta Estes, showing an early diagonal cluster chart with coloured blocks along the diagonal
One of the first FTDNA AutoCluster results, run for genealogist Roberta Estes shortly after AutoCluster launched in late 2018.

In 2019 it reached a vastly wider audience when MyHeritage integrated it directly into their platform.

Animated AutoCluster result embedded in the MyHeritage platform
AutoCluster embedded in MyHeritage.
MyHeritage AutoCluster chart with named matches on both axes and seventeen clusters along the diagonal in different colours
MyHeritage AutoCluster, integrated directly into the MyHeritage platform. The legend at right shows cluster sizes; grey cells outside the diagonals flag matches that straddle more than one cluster.

GEDmatch followed soon after as a Tier-1 tool. In 2020 Ancestry issued a cease-and-desist, formally demanding that Genetic Affairs stop automated retrieval of match data; retrieval stopped on 1 June 2020, ending third-party clustering support for Ancestry. Ancestry introduced its own match clusters in 2025.

2019: Linking clusters together

A first clustering pass groups closely-connected matches into individual clusters, but related clusters are not yet connected to each other. AutoCluster goes a step further: it examines the grey cells (matches that appear in more than one cluster) and uses them to determine which clusters are linked. Clusters that share enough inter-cluster connections are merged into a super-cluster: a larger grouping that may correspond to a grandparent line rather than a single ancestral couple.

AutoCluster chart showing clusters merged into super-clusters along the diagonal, each representing a broader ancestral branch
Clusters that share enough inter-cluster connections have been merged into super-clusters, each likely representing a broader ancestral branch.
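
The merging step can be pictured with a small Python sketch. It is only a sketch under stated assumptions: the source does not say how many inter-cluster connections count as "enough", so the min_links threshold is hypothetical, and a simple union-find stands in for whatever AutoCluster does internally.

```python
def merge_into_superclusters(clusters, shared_pairs, min_links=2):
    """Merge clusters joined by at least `min_links` inter-cluster shared matches.

    clusters: dict cluster_id -> set of match names.
    shared_pairs: iterable of (name_a, name_b) shared-match pairs.
    Returns a list of sets of cluster ids (the super-clusters).
    """
    cluster_of = {name: cid for cid, members in clusters.items() for name in members}

    # Count "grey cells": shared-match pairs whose members sit in different clusters.
    links = {}
    for a, b in shared_pairs:
        ca, cb = cluster_of.get(a), cluster_of.get(b)
        if ca is None or cb is None or ca == cb:
            continue
        key = frozenset((ca, cb))
        links[key] = links.get(key, 0) + 1

    # Union-find over cluster ids.
    parent = {cid: cid for cid in clusters}

    def find(cid):
        while parent[cid] != cid:
            parent[cid] = parent[parent[cid]]
            cid = parent[cid]
        return cid

    for key, count in links.items():
        if count >= min_links:
            ca, cb = tuple(key)
            parent[find(ca)] = find(cb)

    groups = {}
    for cid in clusters:
        groups.setdefault(find(cid), set()).add(cid)
    return list(groups.values())
```

With three clusters where clusters 1 and 2 are joined by two grey cells and clusters 2 and 3 by only one, the default threshold merges 1 and 2 and leaves 3 on its own.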

2024–2025: Weighted clustering and large charts

The original AutoCluster treats every shared-match relationship as equal: either two matches share DNA or they don't. Weighted clustering, introduced in AutoLineage, uses the actual amount of DNA each pair of your matches shares with each other as input. Heavier connections (say, two cousins who share 120 cM with each other) pull matches more strongly together; lighter connections contribute less.
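
The difference between the two approaches shows up even in a toy example. The sketch below is not AutoLineage's algorithm; it simply scores how strongly one match is tied to each cluster, once by counting connections and once by summing the shared cM. All names and the data layout are hypothetical.

```python
def best_cluster(match, clusters, shared_cm, weighted=True):
    """Pick the cluster a match is most strongly connected to.

    clusters: dict cluster_id -> set of match names.
    shared_cm: dict frozenset({a, b}) -> cM shared between matches a and b.
    Returns (best_cluster_id, scores_per_cluster).
    """
    scores = {}
    for cid, members in clusters.items():
        total = 0.0
        for member in members:
            cm = shared_cm.get(frozenset((match, member)), 0.0)
            if cm > 0:
                # Weighted: the connection counts in proportion to its cM.
                # Unweighted: every connection counts as 1.
                total += cm if weighted else 1.0
        scores[cid] = total
    return max(scores, key=scores.get), scores
```

Suppose match X shares 12 cM and 10 cM with two members of one cluster and 120 cM with the single member of another. Counting connections places X in the first cluster (2 against 1); weighting by cM places X in the second (120 against 22).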

At the same time, AutoLineage has gained a large-chart mode that handles thousands of matches smoothly, so you are no longer limited to the 100-match charts of earlier AutoCluster runs. When the large chart is combined with MRCA labelling from imported trees, you end up with a full-tree panorama where every ancestral line is named.

From clusters to named ancestors

A cluster on its own is anonymous: it says "these people are related to you through one of your branches" without telling you which branch. Turning the cluster into a named ancestral line is where the other Genetic Affairs tools come in. AutoTree scans the trees attached to matches within each cluster and searches for common ancestors by comparing names, dates and places. AutoKinship goes further and predicts relationships from the amount of DNA the matches share with you and with each other, without needing any trees at all. AutoLineage combines both approaches and adds a crucial annotation step: when you have Most Recent Common Ancestor (MRCA) information recorded in the notes for your matches, each cluster can be labelled with the MRCA its members appear to descend from.

Large AutoLineage cluster chart where each coloured cluster is labelled with the name of an ancestral couple or lineage
AutoLineage's annotated cluster view. Every cluster has been auto-labelled with the ancestral line its members most likely descend from, "AbrahamJamesJonas/Ragan line", "Ragan/DanielCynthia line", "Weaver/Jack line", and so on. The same information that used to require hours of manual tree comparison is now surfaced automatically during clustering.

When a cluster is labelled in this way, reading the chart becomes a different exercise. Instead of asking "which branch could this be?", you ask "what does my Ragan/Daniel-Cynthia cluster look like, and which matches in it should I contact first?" The clustering has become a starting point for research, not an end in itself.

Shared cM between matches: a new dimension

Early clustering relied solely on whether two of your matches share DNA with each other, a yes-or-no signal. A significant step forward came when testing companies began publishing not just who your shared matches are, but how much DNA those shared matches share with each other. MyHeritage, GEDmatch and 23andMe provided this inter-match cM data from early on. Ancestry added it more recently, and FamilyTreeDNA has now followed, meaning that today, for the first time, all major testing companies supply this information.

This fundamentally changes what is possible inside a cluster. Instead of knowing only that matches A, B and C belong together, you now know that A and B share 220 cM with each other, while B and C share only 38 cM. That extra layer of evidence makes it possible to identify relationships within clusters: not just which branch a cluster represents, but how the individuals in it relate to each other and to you.

Genetic Affairs visualises the inter-match shared cM directly on the cluster chart: each coloured cell carries the cM value between the two matches it represents. The matrix below the diagonal makes the relationship structure visible at a glance.

AutoCluster chart with cM values shown inside the coloured shared-match cells, revealing the relationship structure within each cluster
AutoCluster chart showing inter-match cM values inside each coloured cell. The values let you read the relationship structure within a cluster directly from the chart.
Matrix of shared cM values between every pair of matches in a cluster, colour-coded by strength
The inter-match cM matrix that feeds AutoKinship. Each cell shows how much DNA two of your matches share with each other: the raw material for reconstructing relationships within the cluster.
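
As a rough illustration of the data structure, the following Python sketch builds a symmetric inter-match matrix from a list of pairwise cM values. The function name and input format are hypothetical; the real export formats differ per company.

```python
def cm_matrix(names, pair_cm):
    """Build a symmetric matrix of shared cM between matches.

    names: list of match names (row and column order).
    pair_cm: dict (name_a, name_b) -> shared cM between those two matches.
    """
    index = {name: i for i, name in enumerate(names)}
    matrix = [[0.0] * len(names) for _ in names]
    for (a, b), cm in pair_cm.items():
        i, j = index[a], index[b]
        # Sharing is mutual, so fill both halves of the matrix.
        matrix[i][j] = matrix[j][i] = cm
    return matrix
```

Using the numbers from the text, cm_matrix(["A", "B", "C"], {("A", "B"): 220, ("B", "C"): 38}) puts 220 in the A/B cells, 38 in the B/C cells, and leaves A/C at 0.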

AutoKinship: reconstructing trees from shared DNA

Having inter-match cM data for every pair of matches in a cluster opens the door to automated relationship prediction. AutoKinship takes the full matrix of shared cM values and infers a family tree that is consistent with those numbers, without needing any attached trees or known ancestors. It works with data from any company that provides inter-match cM values, including GEDmatch.

AutoKinship cluster result from GEDmatch showing predicted relationship nodes connecting DNA matches
AutoKinship result for a GEDmatch cluster. Predicted relationship nodes connect the DNA matches based purely on the shared cM values between them.
Reconstructed AutoKinship family tree with numbered ancestral nodes and DNA matches at the leaves annotated with cM values
A reconstructed AutoKinship tree. Numbered nodes are predicted ancestors; DNA matches sit at the leaves with their cM to the tester.

AutoLineage: adding MRCA information and known relationships

AutoLineage takes reconstruction further by allowing you to feed in known relationships and MRCA information from imported trees. Where AutoKinship predicts structure from numbers alone, AutoLineage lets you anchor that structure to real ancestors, confirming, refining or extending the predicted tree with documented genealogical evidence.

Reconstructed tree in AutoLineage with a common ancestral couple at the root and DNA match nodes annotated with cM values and MRCA labels
A reconstructed tree based on DNA in AutoLineage. DNA matches appear at the leaves with their cM values; MRCA annotations link the genetic evidence to documented genealogy.

Three things to keep in mind

Clustering is powerful but not magical. Three caveats are worth internalising:

References

The following blog articles, listed chronologically, trace the arc of cluster analysis from Dana Leeds' original method to today's weighted clustering with MRCA annotations.

The Leeds Method

AutoCluster automates the spreadsheet

Understanding and interpreting clusters

Modern clustering: weighted, large, and MRCA-annotated

Tool pages

AutoCluster

What it does

AutoCluster groups your DNA matches into coloured shared-match clusters that typically represent branches of your family tree. Each coloured cell sits at the intersection between two matches who both match you and each other. Cells of the same colour, plus the grey cells that link them, form a cluster likely descended from a common ancestral couple.

AutoCluster chart showing distinct coloured clusters (orange, green, red, purple, pink) with grey shared-match cells between them
A typical AutoCluster output. Each coloured block is one cluster of closely-linked shared matches; grey cells are the shared-match bridges between clusters.

Where it runs

Step-by-step (FTDNA)

  1. From your Profiles page, click Start analysis and select AutoCluster.
  2. Set max and min shared cM (e.g. 600 / 50), minimum cluster size (2 or 3), and optionally a minimum largest-segment size. Tick Enable AutoTree in the same run (recommended).
  3. Choose match selection: all matches, or start with specific matches of interest.
  4. Enter your FTDNA password and your 2FA code if enabled.
  5. Review the credit cost and click Perform analysis. The ZIP arrives via email and can also be downloaded from the notifications panel in the top right of the site.

Don't set the minimum cM too low on your first run. The server has a time limit. Going too low can reduce usable matches because the job times out before all shared-match data is fetched.

Reading the output

The ZIP contains a main HTML chart, an Excel spreadsheet with the same data (useful when the HTML is huge), a gephi folder with nodes.csv and edges.csv (used later by AutoLineage), chromosome browser files, and a secondary tables-only HTML.

On the chart, grey cells are shared matches spanning more than one cluster. Clusters are sorted using these greys, which produces the characteristic super-cluster shape. Below the chart, a sortable table lists each match with shared cM, cluster, tree size, predicted relationship and notes.

Starting with specific matches of interest

Instead of running on all matches, you can focus a run on specific matches of interest. This is useful for zooming in on the branches you share with one particular cousin.

  1. Run a normal AutoCluster first, unzip, and find the ResultID2 of the match of interest in the matches CSV (in the gephi folder).
  2. Run AutoCluster again, choose Start with specific matches of interest, and paste the ResultID2.
  3. To exclude specific matches: !XYZ removes the match; !!XYZ removes the match and their entire branch.
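
The prefix convention in step 3 can be illustrated with a short Python sketch. It shows only how the ! and !! prefixes differ; the assumption that identifiers are separated by spaces or commas is ours, and the function is hypothetical rather than anything the site exposes.

```python
def parse_match_filter(text):
    """Split a pasted list of identifiers into include / exclude groups."""
    result = {"include": [], "exclude": [], "exclude_branch": []}
    for token in text.replace(",", " ").split():
        if token.startswith("!!"):
            # Double prefix: drop the match and their entire branch.
            result["exclude_branch"].append(token[2:])
        elif token.startswith("!"):
            # Single prefix: drop just this match.
            result["exclude"].append(token[1:])
        else:
            result["include"].append(token)
    return result
```

The !! check has to come first, because every !!-prefixed identifier also starts with a single !.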

Tips

Take it further with AutoLineage. AutoLineage supports match imports for most major testing companies, giving you a single workspace for all your data. From there you can run custom clustering analyses with adjustable parameters, explore advanced visualizations such as UMAP and large-chart mode, label each cluster with a Most Recent Common Ancestor, and use the built-in search to find specific matches or surnames across your entire cluster chart.

AutoTree

What it does

AutoTree reads every available tree attached to matches in each cluster, clusters tree persons by surname → first name → birth/death year, and identifies the most likely common ancestors. It then reconstructs partial genealogical trees from each common ancestor down to the matches. AutoTree works even if you don't have a tree yourself, which makes it especially useful for adoptees.
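
The surname → first name → year grouping can be pictured with a simplified Python sketch. It is not AutoTree's implementation: exact (case-insensitive) name matching, the use of birth year only, the two-year tolerance and the record layout are all assumptions made for illustration.

```python
from collections import defaultdict


def group_tree_persons(persons, year_tolerance=2):
    """Group tree persons by surname, then first name, then birth year.

    persons: list of dicts with keys "tree", "surname", "first", "birth".
    Returns groups of persons that appear in at least two different trees.
    """
    # Steps 1 and 2: bucket by surname and first name.
    by_name = defaultdict(list)
    for person in persons:
        key = (person["surname"].strip().lower(), person["first"].strip().lower())
        by_name[key].append(person)

    # Step 3: within one name, split wherever birth years are too far apart.
    candidates = []
    for same_name in by_name.values():
        same_name.sort(key=lambda p: p["birth"])
        group = [same_name[0]]
        for person in same_name[1:]:
            if person["birth"] - group[-1]["birth"] <= year_tolerance:
                group.append(person)
            else:
                candidates.append(group)
                group = [person]
        candidates.append(group)

    # A candidate common ancestor must occur in the trees of at least two matches.
    return [g for g in candidates if len({p["tree"] for p in g}) >= 2]
```

Two "John Smith" entries born in 1820 and 1821 in different trees are grouped as one candidate ancestor, while a "John Smith" born in 1790 in a third tree is kept apart.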

Where it runs

AutoTree runs directly only on FamilyTreeDNA. For 23andMe, MyHeritage and Ancestry, use AutoLineage instead; AutoLineage's common-ancestor detection is the modern replacement.

Step-by-step (FTDNA)

AutoTree can be run on its own or combined with AutoCluster in a single pass (recommended: you get both results for one job).

  1. From your Profiles page, click Start analysis and select AutoTree, or tick Enable AutoTree when setting up an AutoCluster run.
  2. Set the cM range parameters for the matches whose trees you want to read.
  3. Choose match selection: all matches, or start with specific matches of interest.
  4. Enter your FTDNA password and your 2FA code if enabled.
  5. Review the credit cost and click Perform analysis. The ZIP arrives via email and can also be downloaded from the notifications panel in the top right of the site.

Output

The top-level HTML report includes an overview table. For each cluster, three links take you deeper:

Reconstructed trees shade matches on a colour gradient by shared cM, so the strongest matches stand out visually. The tester appears in green. Hover an edge to highlight the same person in multiple trees; click a cM value for Shared cM Project 3.0 v4 relationship probabilities.

AutoTree reconstructed tree for cluster 1, showing common ancestors at the top and DNA matches at the leaves with cM values on each branch
AutoTree output for cluster 1: a reconstructed tree descending from identified common ancestors, with DNA matches at the leaves and cM values on each branch. Source: Roberta Estes, February 2026.

Tips

Take it further with AutoLineage. AutoTree identifies common ancestors from attached trees, but AutoLineage goes beyond that by letting you also import your own GEDCOM, WikiTree profiles and One2Tree saves, adjust name-matching strictness and birth-year tolerance, and run multiple passes side by side. You can also combine Ancestry, MyHeritage and GEDmatch data in a single workspace. For Ancestry and other vendors where AutoTree cannot run directly, AutoLineage is the only path to common-ancestor analysis.

AutoSegment

What it does

AutoSegment groups matches by overlapping DNA segments on specific chromosomes, a different question from "do these people match each other?" It identifies who shares the same stretch of a chromosome.

AutoSegment ICW cluster chart listing segment clusters with overlapping segment bars per chromosome and match metadata
AutoSegment ICW output: segment clusters on the same chromosomal region, with cM values, match names and links to DNA Painter.
Overlapping segments are not automatically triangulated segments. Only GEDmatch publishes true triangulation data. For GEDmatch runs, Genetic Affairs uses that file to verify overlaps, dramatically improving accuracy. For FTDNA, triangulation is inferred from ICW (In Common With) data: a reasonable approximation, but not a guarantee. When two shared matches have several overlapping segments in common, not all of those segments necessarily triangulate. For other sources, and as a general rule, read AutoSegment results as strong leads rather than proofs.
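
What "overlapping" means can be made concrete with a short Python sketch. It assumes, for simplicity, that segment start and end positions are already expressed in cM; real segment files usually list base-pair coordinates plus a cM length, so a genetic map would be needed in practice. The function and its inputs are hypothetical, and it detects overlap only, not triangulation.

```python
from itertools import combinations


def overlapping_pairs(segments, min_overlap=10.0):
    """Find pairs of matches whose segments overlap on the same chromosome.

    segments: list of (match, chromosome, start, end), positions in cM.
    Returns a list of (match_a, match_b, chromosome, overlap_length).
    """
    pairs = []
    for (m1, c1, s1, e1), (m2, c2, s2, e2) in combinations(segments, 2):
        if m1 == m2 or c1 != c2:
            continue
        # Length of the stretch covered by both segments (negative if disjoint).
        overlap = min(e1, e2) - max(s1, s2)
        if overlap >= min_overlap:
            pairs.append((m1, m2, c1, overlap))
    return pairs
```

Two segments on chromosome 7 spanning 10–40 and 25–60 overlap by 15 and pass a threshold of 10; segments spanning 10–40 and 38–50 overlap by only 2 and are ignored.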

The offline-file situation

As of April 2026, most offline segment files can no longer be downloaded from the testing companies. 23andMe's security changes blocked bulk segment export; MyHeritage and others have tightened access.

The offline AutoSegment workflow is still supported, but in practice it's now mainly useful to people who already saved those files on their computer in earlier years. New users starting in 2026 will generally only be able to use AutoSegment on data still obtainable, most reliably FTDNA Family Finder and GEDmatch Tier 1 exports.

Step-by-step: automated AutoSegment (FTDNA)

The automated AutoSegment works exclusively with FTDNA profiles registered on Genetic Affairs.

  1. From your Profiles page, click Start analysis and select AutoSegment ICW.
  2. Set the minimum segment overlap threshold (10 cM is a solid starting point). Optionally enable pile-up filtering.
  3. Choose match selection: all matches, or start with specific matches of interest.
  4. Enter your FTDNA password and your 2FA code if enabled.
  5. Review the credit cost and click Perform analysis. The ZIP arrives via email and can also be downloaded from the notifications panel in the top right of the site.

Offline workflow (for files you already have)

If you still hold older CSVs: on the landing page, click Run AutoSegment, choose the source company, and upload match and segment files. For GEDmatch, upload the Triangulation CSV too; this unlocks the triangulation-verified version. You can add paternal/maternal labels manually in the last column of the match CSV before upload.

Hybrid AutoSegment

Hybrid AutoSegment combines segment data across multiple companies in one chart. FTDNA liftover harmonises coordinates. Labels on the chart show each company's contribution, and a table shows match counts per company. Hybrid AutoSegment shines when you already hold data from several companies.

Opposite-sides detection & AutoSegment Split

When overlapping segments do not share ICW/triangulation status, the matches likely descend from opposite parental sides. AutoSegment flags this inside each cluster. On GEDmatch, AutoSegment Split takes this further with admixture bars per segment, especially useful for adoptees or for testers with one parent from an under-represented population.

AutoSegment cluster with some rows coloured differently, indicating two separate parental-side sub-clusters within one overlapping segment region
A single overlapping-segment region split into two colours: the tool has detected matches on opposite parental sides.
AutoSegment Split admixture bars showing North East European, Caucasian, Steppe, Indian and Ancestor components per segment
AutoSegment Split (GEDmatch Tier 1): per-segment admixture bars surface ancestry patterns that whole-genome admixture would mask.
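
The opposite-sides idea can be sketched in Python as follows. This is a simplified, greedy illustration and not AutoSegment's actual logic: it takes matches that overlap on one region and groups together only those that are all shared matches of each other. The function name and input shapes are hypothetical.

```python
def split_by_side(overlapping_matches, shared_pairs):
    """Split matches that overlap on one region into mutually-ICW groups."""
    icw = {frozenset(pair) for pair in shared_pairs}
    groups = []
    for match in overlapping_matches:
        for group in groups:
            # Join a group only if this match is ICW with every member.
            if all(frozenset((match, member)) in icw for member in group):
                group.append(match)
                break
        else:
            groups.append([match])
    return groups
```

If A, B, C and D all overlap on the same region but only A/B and C/D are shared matches, the result is two groups, which suggests (but does not prove) that the two groups sit on opposite parental sides.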

Output

Tips

AutoKinship

What it does

AutoKinship predicts family trees from DNA evidence alone. It does not need you or your matches to have trees. Using the shared-cM amounts between you and each match, and between every pair of matches, it infers relationship pathways and draws them as an interactive kinship chart.

AutoCluster result from FamilyTreeDNA showing a coloured cluster that can be used as input for AutoKinship
An example cluster from FamilyTreeDNA. A cluster like this, with its associated inter-match cM values, is the starting point for an AutoKinship run.
Reconstructed AutoKinship tree for cluster 1 with 9 persons and branching generations, numbered 1 through 7, descending to DNA matches with cM labels
A reconstructed kinship tree. Numbered boxes are predicted ancestral nodes; DNA matches sit at the leaves with their cM to the tester.

Where it runs

Step-by-step (FTDNA)

  1. From your Profiles page, click Start analysis and select AutoKinship.
  2. Set thresholds: min/max shared cM, minimum largest segment, and minimum cluster size. Start conservatively.
  3. Choose match selection: "top matches within the selected range" is a solid first-run default.
  4. Enter your FTDNA password and your 2FA code if enabled.
  5. Review the credit cost and click Perform analysis. The ZIP arrives via email and can also be downloaded from the notifications panel in the top right of the site. Save and unzip the report.

What's inside

Understanding the predictions

AutoKinship doesn't see birth dates, so generational placement can be off (an "uncle" might really be a 2C1R). The structure of the tree, who links to whom, is usually right. Use AutoLineage afterwards to feed in known relationships and pin generations.

Tips

Take it further with AutoLineage. The standalone AutoKinship tool works from DNA numbers alone. Inside AutoLineage you can pre-define known relationships as hard constraints, feed in MRCA information from common-ancestor analysis, and set generational offsets, so the probability engine only considers trees consistent with both the DNA and your genealogy. The result is a tighter, better-anchored tree than the standalone run can produce.

Common ancestors background

This chapter is a long-form companion to the AutoTree and AutoKinship tool pages. It explains how Genetic Affairs finds common ancestors: the algorithms, the history behind them, and how the approach has evolved between 2018 and 2026. If you want to run the tools, the AutoTree and AutoKinship tabs cover the step-by-step instructions. If you want to understand what happens inside the box, why the standalone tools work the way they do, and why AutoLineage exists alongside them, read on.

The short version. The standalone AutoTree and AutoKinship tools are automated, opinionated pipelines with fixed internal parameters. They don't expose knobs for name-matching strictness, birth-year tolerance, clustering weights, or external tree sources. Everything is tuned for "press the button and get a report." AutoLineage was built as the workbench successor: the same algorithms, but with every parameter exposed, plus the ability to import your own GEDCOMs, One2Tree saves, WikiTree matches and trees downloaded for all matches at GEDmatch. The automated tools remain the fastest way to get a first look; AutoLineage is where you refine.

The common-ancestor problem, in plain language

When you open a list of DNA matches, every stranger on that list shares DNA with you because, at some point in the past, you share a most recent common ancestor (MRCA) or an ancestral couple. The genealogical question is: which ancestor? Answering it by hand means chasing each match's attached tree back generation by generation, matching names and places between trees, and trying to decide whether two "John Smith b. ~1820" entries refer to the same man.

Doing that across hundreds of matches is impossible in any reasonable amount of time. Automating it is hard for three interlocking reasons:

AutoTree's answer is a three-step clustering pipeline that we'll look at in detail below. AutoKinship sidesteps the problem entirely: it reconstructs the tree from the amount of DNA each pair of matches shares with each other, using no genealogy at all.

Timeline of common-ancestor discovery at Genetic Affairs

The Genetic Affairs tools did not arrive in one release. Each tool addressed a different piece of the common-ancestor problem, and each built on what came before.

Date Milestone
Late 2018 AutoCluster launches. No tree integration yet: it groups matches by shared-match relationships only. Establishes the visual paradigm (diagonal cluster chart) that everything else builds on.
Feb 2019 MyHeritage licenses AutoCluster and integrates it natively, exposing the method to millions of testers.
Dec 2019 AutoTree launches. First release of the three-step surname → first-name → date clustering to identify common ancestors inside the trees of each cluster's matches. ZIP output includes GEDCOM exports of every reconstructed tree.
May 2020 AutoPedigree launches. AutoTree output is used to synthetically generate descendants and rank candidate positions where an unknown tester might fit inside a candidate tree, scoring each position against the shared cM between the tester and the matches.
May 2020 Ancestry issues a cease-and-desist. Automated retrieval of Ancestry matches, trees and ThruLines data stops on 1 June 2020. The block has never been lifted.
Apr 2021 GEDmatch (Verogen / QIAGEN) licenses AutoCluster, AutoTree and AutoPedigree as part of Tier 1.
Oct 2021 AutoKinship launches. First tool to reconstruct genetic family trees from shared-cM amounts alone; no trees are required from the tester or the matches. Initially automated only for 23andMe, because only 23andMe (at the time) exposed inter-match cM.
Feb 2022 AutoKinship is added to GEDmatch Tier 1. The GEDmatch implementation uses true triangulation data and fully-identical-region (FIR) data to discriminate full siblings from half siblings, an accuracy boost unique to GEDmatch.
Mar 2024 AutoLineage launches. Rebuilds AutoTree and AutoKinship as interactive modules inside a browser workbench, with user-adjustable parameters, external GEDCOM import, and cross-company match pools. This is the first Genetic Affairs product where users can tune the algorithms themselves.
~2025 FTDNA's Matrix upgrade exposes match-to-match cM, enabling automated AutoKinship for FTDNA for the first time.
Feb 2026 Modern workflow consolidates: automated FTDNA AutoKinship + AutoTree first, then AutoLineage for refinement with GEDCOMs, WikiTree imports, and known-relationship pinning.

How AutoTree identifies common ancestors: the three-step algorithm

AutoTree operates per cluster. Inside one cluster of shared matches, it pools every person who appears in any attached tree (a single cluster of ten matches might pool 3,000–15,000 tree persons) and then searches that pool for MRCAs using three consecutive network-clustering passes. The same common-ancestor identification is also performed across all trees from all clusters combined, so an MRCA can be found between matches that sit in different clusters.

Diagram of the three-step pipeline. Step 1, surname similarity network (Smith, Smyth, Smithe, Blom): cluster similar surnames. Step 2, first name within surname cluster (Johan, Jan, John, Mary): cluster similar first names. Step 3, year of birth and death (John Smith 1820, John Smith 1822, John Smith 1860): split by birth year. Result: a single identified "person" across multiple match trees, which becomes an MRCA candidate.
The AutoTree three-step clustering pipeline, reconstructed from the official Genetic Affairs methodology. Each step narrows the pool: surnames first, then first names within each surname cluster, then birth/death years to split genuine collisions.

Step 1: Surname similarity network

AutoTree collects every surname appearing in every tree across the cluster and builds a network where each surname is a node and edges link surnames that look alike. A network-clustering algorithm then groups similar surnames into surname clusters, so that Smith, Smyth and Smithe end up together even when two match trees spell the same ancestral surname differently.
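The idea of a surname similarity network can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the similarity measure (difflib's SequenceMatcher ratio), the 0.7 threshold and the use of connected components as the clustering step are all stand-ins chosen for the example, since the manual does not say which measure or network-clustering algorithm AutoTree uses.

```python
from difflib import SequenceMatcher
from itertools import combinations


def cluster_surnames(surnames, threshold=0.7):
    """Group surnames whose pairwise similarity meets the threshold.

    Assumption: connected components of the similarity network stand in
    for the (undisclosed) network-clustering algorithm.
    """
    names = sorted(set(surnames))
    parent = {name: name for name in names}

    def find(name):
        # Follow parent links to the representative of this component.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for a, b in combinations(names, 2):
        similarity = SequenceMatcher(None, a.lower(), b.lower()).ratio()
        if similarity >= threshold:
            # An edge in the network: merge the two components.
            parent[find(a)] = find(b)

    clusters = {}
    for name in names:
        clusters.setdefault(find(name), []).append(name)
    return sorted(clusters.values())


surname_clusters = cluster_surnames(["Smith", "Smyth", "Smithe", "Blom"])
print(surname_clusters)
```

With these four names, the three Smith variants end up in one group and Blom in its own. A stricter threshold would split more spellings apart, which is the kind of trade-off a name-matching strictness setting controls.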

Step 2: First-name clustering within each surname group

Within each surname cluster from Step 1, a second similarity network clusters first names. This is how AutoTree decides whether "Johan Smit" in tree A and "Jan Smit" in tree B are the same man. The same caveats about locale-specific name behaviour apply here too.

Step 3: Birth/death year disambiguation

The third pass splits first-name clusters by birth and death year with a fixed tolerance. Two different John Smiths born forty years apart remain distinct; two John Smiths born two years apart are treated as the same person. The year tolerance is not exposed in the standalone tool; AutoLineage's equivalent exposes it with a default of ±2 years.
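A minimal Python sketch of the year split, assuming birth years only and a ±2-year tolerance (the AutoLineage default mentioned above). The function name and the (tree, year) tuples are hypothetical, and the real tool also considers death years.

```python
def split_by_birth_year(persons, tolerance=2):
    """Split same-name tree persons into groups by birth year.

    `persons` is a list of (tree_id, birth_year) tuples. After sorting by
    year, a person joins the current group when their birth year is within
    `tolerance` years of the previous person's; otherwise a new group starts.
    """
    groups = []
    for tree_id, year in sorted(persons, key=lambda person: person[1]):
        if groups and year - groups[-1][-1][1] <= tolerance:
            groups[-1].append((tree_id, year))
        else:
            groups.append([(tree_id, year)])
    return groups


john_smiths = [("tree A", 1820), ("tree B", 1822), ("tree C", 1860)]
year_groups = split_by_birth_year(john_smiths)
print(year_groups)
```

The 1820 and 1822 entries land in one group and the 1860 entry in another. Note that comparing each person with the previous one allows chaining (1820, 1822, 1824 would all merge); whether the actual tool behaves that way is not documented.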

From identified persons to MRCAs

Once the three passes have identified a set of "canonical" tree persons that appear in two or more match trees, AutoTree walks each match's tree upward from those persons and flags ancestor pairs (or single ancestors) that serve as the MRCA for two or more matches in the cluster. These become the cluster's common-ancestor candidates. A final pass extends the search across cluster boundaries: if the same MRCA turns up in three different clusters, that's a strong signal the clusters share a branch.
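To make "most recent" concrete, here is a small Python sketch for a single pair of matches. It assumes each match's tree has already been reduced to a mapping from canonical person ID to generations above the match; the IDs, the depths and the tie-break rule are made up for illustration and are not taken from the tool.

```python
def most_recent_common_ancestor(ancestors_a, ancestors_b):
    """Return the shared canonical person closest to both matches, or None.

    Each argument maps a canonical person ID to the number of generations
    between that person and the match whose tree was walked upward.
    """
    shared = ancestors_a.keys() & ancestors_b.keys()
    if not shared:
        return None
    # Fewest total generations wins; the ID breaks ties deterministically.
    return min(shared, key=lambda person: (ancestors_a[person] + ancestors_b[person], person))


match_a = {"Jan Smit 1820": 2, "Pieter Smit 1790": 3}
match_b = {"Jan Smit 1820": 3, "Pieter Smit 1790": 4}
mrca = most_recent_common_ancestor(match_a, match_b)
print(mrca)
```

Both persons appear in both trees, but Jan Smit is fewer generations away from the two matches, so he is reported rather than his father.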

AutoTree reconstructed tree output showing an MRCA couple at the top, their descendants branching down, and the DNA matches highlighted at the leaves with coloured cM labels
Official example of an AutoTree reconstructed tree. The MRCA couple sits at the top; descendants branch down to the DNA matches at the leaves. The yellow-to-red colour gradient encodes cM strength, so the strongest matches pop visually. Source: geneticaffairs.com/autotree.html.

Common locations as a second signal

In parallel with the ancestor search, AutoTree runs a location analysis: birth, marriage and death places from all match trees are geocoded and clustered using distance thresholds. When several matches in a cluster have tree persons who were born within a small radius of each other (think one rural Dutch parish, or a single Frisian village), that geographic concentration is an independent line of evidence for a shared ancestral line. Location clusters are reported alongside ancestor clusters in the output.
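A distance threshold on geocoded places can be sketched with the haversine formula. The coordinates, the 10 km radius and the function names below are illustrative assumptions, and the sketch is a simple radius filter around a chosen centre rather than the clustering AutoTree actually performs.

```python
from math import asin, cos, radians, sin, sqrt


def haversine_km(a, b):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(radians, (*a, *b))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(h))


def matches_near(places, centre, radius_km=10.0):
    """Return the matches with a tree event within radius_km of centre."""
    return sorted({match for match, point in places if haversine_km(point, centre) <= radius_km})


# Made-up coordinates: two events a few kilometres apart, one far away.
places = [
    ("match A", (53.20, 5.80)),
    ("match B", (53.22, 5.85)),
    ("match C", (52.37, 4.90)),
]
nearby = matches_near(places, centre=(53.21, 5.82))
print(nearby)
```

Matches A and B fall inside the radius and C does not, which is the kind of geographic concentration the location analysis reports.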

How AutoKinship reconstructs trees from DNA alone

AutoKinship asks a different question from AutoTree. AutoTree asks "what MRCA explains why these people appear in each other's trees?"; AutoKinship asks "what tree topology best explains the amount of DNA these people share with each other, even if no trees exist?" It needs genealogy from no one. It only needs the cM matrix.

The input: a full inter-match cM matrix

For each cluster, AutoKinship reads the shared cM between the tester and every cluster member and between every pair of cluster members. That second number (how much DNA your matches share with each other) is the critical one. Without it, you only know that matches A, B and C belong together; with it, you know that A and B share 220 cM while B and C share only 38 cM, which is enough evidence to say A and B are close cousins and C is a more distant relative on the same line.
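As a data structure, the input is a symmetric matrix keyed by pairs of people. The Python sketch below builds one from a list of unordered pairs and reports any pair that is still missing. The helper name is hypothetical and the values are illustrative (A–B and B–C follow the example in the paragraph above).

```python
def build_cm_matrix(pair_cm):
    """Expand unordered pair values into a symmetric nested-dict matrix."""
    people = sorted({person for pair in pair_cm for person in pair})
    matrix = {p: {q: None for q in people if q != p} for p in people}
    for (p, q), cm in pair_cm.items():
        matrix[p][q] = cm
        matrix[q][p] = cm  # shared cM is the same in both directions
    return matrix


pair_cm = {
    ("A", "B"): 220,
    ("A", "C"): 45,
    ("A", "D"): 62,
    ("B", "C"): 38,
    ("B", "D"): 54,
    ("C", "D"): 180,
}
cm_matrix = build_cm_matrix(pair_cm)

# Pairs without a value are the gaps that limit a DNA-only reconstruction.
missing = [(p, q) for p in cm_matrix for q in cm_matrix[p] if p < q and cm_matrix[p][q] is None]
print(cm_matrix["B"], missing)
```

An empty `missing` list means every pair has a value, which is the situation the full inter-match matrix describes.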

Diagram of the AutoKinship workflow. Input: a shared-cM matrix for matches A to D, every pair, every direction (A–B 220, A–C 45, A–D 62, B–C 38, B–D 54, C–D 180). Probability engine: Nicholson simulated distributions, 500,000 pairs per relationship type, plus vendor-specific tables (Shared cM fallback). Output: ranked trees (1. top tree, P = 0.42; 2. P = 0.18; 3. P = 0.11; up to about 10 per cluster). AutoKinship uses no trees: the shared-cM matrix alone determines the ranked topologies.
AutoKinship's workflow in the abstract: read the cM matrix, enumerate candidate tree topologies, score each against the simulated probability distribution for every cM value, and return the top-ranked trees.

The probability engine

The core statistical backbone is Brit Nicholson's simulated shared-cM distributions, published in April 2021: 500,000 simulated pairs per relationship type, producing a probability distribution of shared cM for every relationship from parent-child out to distant cousins. AutoKinship layers vendor-specific probability tables on top, because MyHeritage, 23andMe and GEDmatch each have their own measurement biases.

Tree enumeration and ranking

For each cluster, AutoKinship generates candidate tree topologies that could plausibly produce the observed cM pattern. For each candidate tree it multiplies the probability of every observed cM value (both tester-to-match and match-to-match) under that tree; the product is the tree's combined likelihood. Trees are ranked by likelihood and roughly the top ten are returned, each with a ratio showing how much more likely it is than the next-lower-ranked tree.
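The ranking arithmetic can be illustrated in Python. The per-pair probabilities below are invented for the example (real values would come from the simulated distributions), the function name is hypothetical, and summing logarithms instead of multiplying directly is a design choice of this sketch to avoid numerical underflow, not something the manual states about the tool.

```python
from math import exp, log


def rank_trees(candidates):
    """Rank candidate trees by the product of their per-pair probabilities.

    `candidates` maps a tree label to the probabilities of each observed
    cM value under that tree. All probabilities must be greater than zero.
    """
    scored = sorted(
        ((sum(log(p) for p in probs), label) for label, probs in candidates.items()),
        reverse=True,
    )
    ranking = []
    for i, (log_like, label) in enumerate(scored):
        # Ratio to the next-lower-ranked tree; the last tree has none.
        ratio = exp(log_like - scored[i + 1][0]) if i + 1 < len(scored) else None
        ranking.append((label, exp(log_like), ratio))
    return ranking


candidates = {
    "tree 1": [0.5, 0.4, 0.5],
    "tree 2": [0.5, 0.2, 0.5],
    "tree 3": [0.25, 0.2, 0.5],
}
ranking = rank_trees(candidates)
for label, likelihood, ratio in ranking:
    print(label, round(likelihood, 4), ratio and round(ratio, 2))
```

Here tree 1 has twice the combined likelihood of tree 2, which in turn has twice that of tree 3. A ratio close to 1 between neighbouring trees means the DNA evidence barely separates them.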

Reconstructed AutoKinship tree showing predicted numbered ancestral nodes at the top and the DNA matches at the leaves, each annotated with cM values
A reconstructed AutoKinship tree. Numbered nodes are predicted ancestors; the tester and the matches sit at the leaves with their inter-match cM values. The topology is derived from DNA alone; no genealogical trees were consulted. Source: geneticaffairs.com/autokinship.html.

Known failure modes

AutoKinship's weakness is the opposite of AutoTree's: it has perfect cM evidence but no birth-year evidence, so it cannot tell you which generation a person belongs to. Matches are frequently placed at the wrong generational level: the structure of the tree (who links to whom) is usually right, but the vertical placement can be off by a generation. In a cluster dominated by 1C2Rs, for example, the tool may rank trees that place the true 1C2Rs as 2C or 2C1R, because the cM values are also consistent with the shallower interpretation.

Other practical limits:

AutoPedigree

AutoPedigree (May 2020) takes AutoTree's MRCA output and uses it to generate candidate positions where an unknown tester might fit inside a reconstructed tree. For each MRCA couple supported by enough matches (default: three matches at 30–40 cM or more sharing a common-ancestor tree), AutoPedigree synthetically generates descendants and ranks each candidate position by multiplying the Shared cM Project probability of each match's observed cM against the hypothesis. The resulting combined odds ratio ranks the candidates, with badges marking the top five (green), other viable candidates (orange) and impossible positions (red).

AutoPedigree-style pedigree chart with a common-ancestor couple at the top, descendant generations drawn as rectangles, and a tester placeholder position marked inside the tree
AutoPedigree inherits AutoTree's reconstructed tree and adds synthetic descendants, then ranks candidate positions the tester could occupy inside that tree. Especially powerful for adoptees and unknown-parentage cases, because the tester's own tree is deliberately excluded from the hypothesis generation. Source: geneticaffairs.com/autopedigree.html.
Score-direction trap. AutoPedigree uses lower-is-better scoring: smaller numbers indicate more likely positions. This is the opposite of some other hypothesis tools, so pay attention to the direction of the ranking when comparing results.

Why AutoLineage was built: a configurable workbench

AutoLineage (March 2024) is the successor design. Rather than add knobs to the legacy standalone tools, the algorithms were rebuilt as modules inside a browser workbench where every parameter is exposed and, critically, where the match pool itself can be augmented with external data. The same AutoTree three-step pipeline and the same AutoKinship probability engine run under the hood, but you now decide how they run.

Diagram comparing the two designs. Standalone AutoTree / AutoKinship (automated, opinionated, fixed): FTDNA / GEDmatch match pool; fixed algorithm with no strictness slider and no GEDCOM import; ZIP report by email; one shot, no iteration; no known-relationship pins, no external data, no parameter tuning. AutoLineage workbench (configurable, cross-vendor, iterative): augmented match pool of FTDNA, MyHeritage, Ancestry, GEDmatch and 23andMe (legacy), plus your GEDCOMs, One2Tree, WikiTree and all-matches trees; the same algorithms, exposed (strictness, year tolerance, weighted clustering, known-relationship pins); interactive results that you can iterate on and re-run without paying again.
Standalone tools vs. AutoLineage. The algorithms are the same under the hood; what's different is the match pool (one vendor vs. five, plus external trees) and the parameter surface (none vs. all).

External trees: the biggest difference

In a standalone AutoTree run, the tool reads only the trees attached to your matches inside that single vendor. If your FTDNA matches don't have trees, the MRCA search has nothing to work with. AutoLineage lifts this ceiling in five directions:

Exposed parameters: finally

AutoLineage's Find Common Ancestors dialog exposes what the standalone tool hides: name-matching strictness, year tolerance (default ±2 years), location distance thresholds, and the specific person fields to compare on. You can run several passes with different settings and compare the resulting MRCA lists side-by-side. The AutoKinship wizard inside AutoLineage exposes max generations, trees per iteration, final trees to keep, and, most importantly, checkboxes for Include known relationships and Include MRCA relationships, so documented genealogy and DNA evidence pin each other in one pass.

Known relationships: pinning what you already know

Before you run AutoKinship inside AutoLineage, you can mark relationships you already know for certain: this match is your first cousin, that match is a sibling of your mother, these two matches are brothers. The probability engine then treats those relationships as hard constraints and only enumerates trees consistent with them. Generational-direction pins solve the other half of the problem: tell AutoKinship which match is older than which, and the ambiguous-generation ranking problem largely goes away.
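Treating a pin as a hard constraint amounts to discarding every candidate tree that disagrees with it. The Python sketch below shows that filtering step on a toy representation; the tree encoding (pair of people mapped to a relationship label), the labels and the function name are all assumptions made for illustration, and the real engine constrains the enumeration itself rather than filtering afterwards.

```python
def consistent_with_pins(tree, pins):
    """True when every pinned pair has the pinned relationship in the tree."""
    return all(tree.get(frozenset(pair)) == relationship for pair, relationship in pins.items())


# Each candidate tree maps an unordered pair of people to a relationship.
candidate_trees = {
    "tree 1": {frozenset({"tester", "A"}): "1C", frozenset({"A", "B"}): "sibling"},
    "tree 2": {frozenset({"tester", "A"}): "2C", frozenset({"A", "B"}): "sibling"},
    "tree 3": {frozenset({"tester", "A"}): "1C", frozenset({"A", "B"}): "1C"},
}

# What you already know for certain.
pins = {("tester", "A"): "1C", ("A", "B"): "sibling"}

kept = [label for label, tree in candidate_trees.items() if consistent_with_pins(tree, pins)]
print(kept)
```

Only tree 1 satisfies both pins, so it is the only candidate left to score against the cM evidence.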

Vendor support for common-ancestor work, over time

The table below summarises how each vendor fit into the common-ancestor pipeline at each stage of the history. Read it top-to-bottom for a sense of how dramatically the picture has shifted in seven years.

Era | FTDNA | MyHeritage | Ancestry | 23andMe | GEDmatch
2019–2020 (AutoTree era) | Automated | Licensed AutoCluster, AutoTree via GA site | Automated until June 2020 | No trees to parse | Before Tier 1 deal
2021 (AutoKinship launches) | AutoCluster/AutoTree automated; AutoKinship not yet (no inter-match cM) | AutoCluster native; AutoKinship manual entry | | AutoKinship automated (first vendor) | AutoCluster/AutoTree Tier 1; AutoKinship not yet
2022–2023 (GEDmatch Tier 1) | AutoTree automated, AutoKinship still manual | Unchanged | | AutoKinship automated | Full suite including AutoKinship with FIR / triangulation
2024 (AutoLineage launches) | Full automated suite on GA site | Via AutoLineage import | Via AutoLineage Copy-Paste Wizard & Pro Tools | 23andMe security changes end automated retrieval | Full Tier 1; AutoKinship from GEDmatch feeds AutoLineage
2025–2026 (Current) | Fully automated, including AutoKinship after FTDNA Matrix upgrade | AutoLineage integration; legacy AutoClusters accepted | AutoLineage Copy-Paste Wizard | Legacy data only | Tier 1 remains the gold standard; drives exhaustive MRCA analysis up to 7,500 matches

Practical conclusion: which tool when?

A note on limits. The tools are superb lead generators, but the leads are candidates to verify against original records (baptisms, censuses, marriages) before you rely on them. Neither AutoTree nor AutoKinship, and not even AutoLineage with every knob set, replaces the genealogist's evidence work. What they replace is the unworkable hour-count it used to take to find the leads worth verifying.

References for this chapter

The articles below are the ones cited or drawn on directly for this deep-dive chapter. Fuller blog archives are listed under the Help tab.

Algorithmic and conceptual

Launch articles: how each tool was received

Workbench era: AutoLineage & common ancestors at scale

AutoLineage: the flagship tool

AutoLineage is where everything comes together. It is the only Genetic Affairs tool that is genuinely cross-company: combine FTDNA, MyHeritage, Ancestry, 23andMe and GEDmatch data for the same tester in one profile, import trees from any source, cluster and re-cluster, detect common ancestors, and produce a refined, blended tree.

A large clustering visualization with cluster annotations based on MRCA data and demonstrating the search functionality.

Prerequisites

What AutoLineage is

AutoLineage uses advanced clustering to analyse your DNA matches and trees, helping you identify common ancestors across multiple DNA testing companies. It's particularly strong at three things:

  1. Organising matches: cluster by shared matches, re-cluster with different parameters, explore cluster-by-cluster rather than drowning in one giant list.
  2. Finding common ancestors: read the trees of your matches in bulk and search systematically for persons who appear in multiple trees.
  3. Reconstructing trees: blend AutoKinship's DNA-driven tree predictions with the documented MRCAs you identify.

How it differs from the other tools

AutoCluster, AutoTree and AutoKinship each do one thing automatically. AutoLineage is the workbench: you control parameters, iterate, and combine evidence. Advanced features require an active monthly subscription; the automated AutoKinship run that feeds AutoLineage runs on free credits.

What works with AutoLineage

Company Import options
Ancestry
  • Copy-Paste Wizard
  • Locally saved HTML files
  • CSV files from DNA Gedcom
FamilyTreeDNA
  • nodes.csv + edges.csv from an automated run
  • CSV files from DNA Gedcom
MyHeritage
  • CSV file from MyHeritage (no longer available)
  • CSV files from DNA Gedcom
  • Locally saved HTML files
  • AutoCluster HTML file
23andMe
  • CSV file from 23andMe (no longer available)
  • CSV files from DNA Gedcom
  • Locally saved HTML files
  • AutoCluster HTML file
GEDmatch
  • nodes.csv + edges.csv from an automated AutoCluster endo run (recommended)
  • One-to-many HTML file
  • One-to-many CSV file
All vendors AutoLineage backup files (matches and shared matches)

The Copy-Paste Wizard

Added in late 2025, the Copy-Paste Wizard transformed the Ancestry workflow.

Copy-Paste Wizard dialog showing parsed DNA matches with a live log of identified matches, existing vs. newly imported counters, and Save/Close buttons
The wizard parses matches live as you paste and shows running counts of existing vs. newly imported matches.
  1. Create a Generic DNA test under your AutoLineage profile.
  2. Under Import Matches, click Copy Paste Wizard.
  3. Open your Ancestry match list in another tab. In the URL, change ItemsPerPage=50 to ItemsPerPage=100 to double capture per paste.
  4. Ctrl-A / Cmd-A, copy, click the blue Paste Data button in the wizard and paste. Counters update live.
  5. Repeat for shared-match pages. The wizard auto-adds matches it hasn't seen, even low-cM ones.
  6. If nothing appears after pasting, re-select and copy from the very top of the source page; a quirk of Ancestry's layout occasionally hides the data.

The main workflow

Phase 1: Create Profile and register DNA test

  1. Open members.geneticaffairs.com/autolineage and click New Profile.
  2. From the profile, click Register DNA test. Choose the company (FTDNA, 23andMe, MyHeritage, Ancestry, GEDmatch, or Generic).
  3. Save. You arrive at the DNA Test Overview page. One profile can hold multiple DNA tests.

Phase 2: Import matches

  1. Click Import Matches. Options (vary by company): CSV, HTML, Copy-Paste Wizard, or FTDNA nodes.csv from an AutoKinship report.
  2. A dialog confirms how many matches were loaded.
DNA test overview page in AutoLineage with Import Matches and Import Shared Matches panels and a blue Copy Paste Wizard button
The DNA Test Overview page. Import matches on the right, shared matches below. Copy-Paste Wizard is available for both.

Phase 3: Import shared matches (ICW)

  1. Click Import Shared Matches. MyHeritage: AutoClusters HTML. FTDNA: edges.csv. Ancestry/Generic: paste via Copy-Paste Wizard.
  2. The ICW column fills in with shared-match counts.

Phase 4: Cluster

  1. Click Clustering. Set min/max cM, weighted vs. unweighted, inter-match shared-cM cutoff, cluster density, colour scheme.
  2. Click Start Clustering. The chart appears in AutoLineage.
AutoLineage Start clustering wizard with cM range, shared-match counts, sparse/normal/dense/bit-dense options and colour scheme selection
The clustering wizard. Set cM range, density, shared-match threshold and colour scheme, then Start Clustering.

You can re-cluster instantly with different parameters without starting a new job. A huge orange cluster from the automated run may resolve into three coherent sub-clusters here.

Table of clustering analyses with DTC, DNA test, zoom, match count, min/max cM, type (weighted/unweighted) and dates
Every clustering run is saved, so you can compare several runs side-by-side (weighted vs. unweighted, different cM cutoffs) from the Clustering Analyses table.

Phase 5: Import trees

  1. Go to Tree Management → Import Trees and choose one or more of the import options below.
Import option Details
GEDCOM tree
How to get a GEDCOM

GEDCOM is a universally-accepted file format for family tree files. The One2Tree Chrome plugin allows users to download trees from matches of Ancestry, MyHeritage and FamilyTreeDNA.

GEDmatch GEDCOM tables saved as HTML (2 files)
How to save the GEDmatch GEDCOM files

Navigate to the one-to-many-full and select 7500 matches. Click on Search. Save the resulting page to your hard drive (default file name: One-to-Many Tool - Full Version _ GEDmatch.html). Next, select all kits with a GEDCOM by clicking Select all with GEDCOMs and click Visualization options. Visit the GEDCOM tab and run the Find matching GEDCOMs tool. Wait until the table has loaded and save the file as HTML (default file name: kit_gedcom_match2.php.html). Then run the Find matching GEDCOMs (anc) tool. Wait until the table has loaded (this may take some time) and save as HTML (default file name: kit_gedcom_anc2.php.html). To load all trees, first import the 7500 matches into AutoLineage, then select both kit_gedcom_match2.php.html and kit_gedcom_anc2.php.html to import the trees. One file contains the actual GEDCOM data; the other links the kit number to the tree and its root person.

CSV file from DNA Gedcom
How to export from DNA Gedcom

In DNA Gedcom, export the results for a profile. For the tree file, select the CSV file that starts with a_ followed by the name of the selected profile.

Backup file from AutoLineage
How to use an AutoLineage backup

In AutoLineage, create a backup of your tree and use this option to import those matches.

Backup file from AutoLineage (automated linking)
How automated linking works

In AutoLineage, create a backup of your tree that includes trees linked to DNA matches. When importing this file, matches with the same identifier will be automatically linked to the corresponding trees.

FTDNA HTML files in the matches folder
How to import FTDNA match HTMLs

Run the AutoKinship analysis. Once finished, save the ZIP file to your local drive and unzip it. Select all HTML files in the matches folder. Tree information linked to each FTDNA match will automatically be imported and linked to the corresponding match.

GEDmatch HTML files in the AutoKinship matches folder
How to import GEDmatch AutoKinship match HTMLs

Run the AutoKinship analysis (available for Tier 1 users) and select the 500 kits option. Once finished, save the ZIP file to your local drive and unzip it. Select all HTML files in the matches folder. Tree information linked to each GEDmatch match will automatically be imported and linked to the corresponding match.

WikiTree compact tree HTML (saved page)
How to save a WikiTree compact tree

On WikiTree, open the profile of the person you want as the tree root. Click the compact tree (pedigree) icon next to their name to open the Compact Tree page. In your browser, save that page as an HTML file (File → Save Page As). Then select the saved file here to import all available ancestors.

Tip. Name GEDCOM files with the match's name and shared cM; AutoLineage uses the filename to guess which DNA match the tree belongs to.
Tip. Tools like the One2Tree Chrome extension make gathering trees from Ancestry and MyHeritage much easier.

Phase 6: Link your own tree

  1. In your profile pane click Link to Existing Tree.
  2. Select the tree, pick yourself as the root person, save. Your tree now participates in the common-ancestor search.

Phase 7: Find common ancestors

  1. Open Find Common Ancestors from the Profile perspective. Adjust name-matching strictness, birth/death year tolerance (default ±2 years), and comparison parameters.
  2. Run. A dialog reports how many trees and MRCAs were found.
  3. Filter reconstructed trees by common ancestor, tree, or linked DNA match.
Common Ancestors table with filter controls showing ancestor names, surnames, birth and death years and places, each with a checkmark to include in the filtered view
Filter reconstructed trees by a specific common ancestor, tree, or DNA match to focus your investigation.

If expected matches don't converge on an MRCA, it's usually due to small tree inconsistencies: a missing middle name, a birth year off by three, a maiden-vs-married surname. Fix one tree and re-run.

Phase 8: Refined AutoKinship inside AutoLineage

The standalone AutoKinship tool works from shared-cM numbers alone. Running AutoKinship inside AutoLineage is more powerful because you can tell it what you already know: mark a match as a known first cousin, flag two matches as siblings, or set generational offsets. You can also feed in MRCA information gathered in Phase 7, so the probability engine treats documented ancestors as fixed points and only considers trees consistent with both the DNA and the genealogy. The result is a tighter, better-anchored tree than the standalone tool can produce.

  1. In the clustering view, select 1× view and open the cluster for which you want to reconstruct a tree from DNA evidence and MRCA data (where available).
Orange-shaded matrix of cM values between every pair of DNA matches in a cluster, used as input for kinship prediction
The input for AutoKinship: a matrix of shared cM between every pair of matches in the cluster. Each orange cell is an inter-match cM value.
  2. Click the matches pane at the top of the cluster; an AutoKinship button appears.
  3. The wizard lets you set max generations, trees per iteration, final trees to keep, plus Include known relationships and Include MRCA relationships.
  4. Before running, define known relationships manually: yourself to any match, any match to any other, generational offsets.
  5. Run. The output is a blended tree: DNA evidence fills the blanks for matches without trees, and documented genealogy provides structure where it exists.
Reconstructed tree showing a common ancestral couple at the root and pink/purple DNA match nodes at the leaves, each annotated with cM values
Reconstructed tree descending from an identified common ancestral couple, with DNA matches at the leaves and cM values on each branch.

Weighted vs. unweighted clustering

Unweighted clustering (the traditional default) treats every shared-match relationship equally. Weighted clustering considers the cM amount between shared matches, so heavy-cM connections pull matches together more strongly. Weighted clustering often resolves generational depth: one weighted cluster may point to 2×-great-grandparents while a sibling cluster points to 3×-great-grandparents. Try both and compare.
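The difference between the two modes comes down to the weight placed on each shared-match link. The Python sketch below is a deliberately simplified illustration, not the clustering algorithm itself: it only asks which group a single match is most strongly tied to, counting links in the unweighted case and summing cM in the weighted case. The group names, cM values and function name are invented.

```python
def preferred_group(shared_matches, group_of, weighted):
    """Pick the group a match is most strongly connected to.

    shared_matches: list of (other_match, shared_cm) links for one match.
    group_of: maps each other match to the group it currently sits in.
    """
    totals = {}
    for other, cm in shared_matches:
        weight = cm if weighted else 1  # unweighted: every link counts the same
        group = group_of[other]
        totals[group] = totals.get(group, 0) + weight
    return max(totals, key=totals.get)


group_of = {"R1": "red", "R2": "red", "O1": "orange"}
links = [("R1", 25), ("R2", 30), ("O1", 400)]

unweighted_choice = preferred_group(links, group_of, weighted=False)
weighted_choice = preferred_group(links, group_of, weighted=True)
print(unweighted_choice, weighted_choice)
```

Two weak links outvote one strong link when every link counts equally, so the unweighted choice is red; once cM is taken into account the single 400 cM link dominates and the choice flips to orange.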

UMAP scatter plot of DNA matches, with a red cluster in the upper right and an orange cluster in the lower right; a 455 cM match labelled near the orange cluster
UMAP view. A 455 cM match that unweighted clustering placed in the red group sits closer to the orange cluster in UMAP space; weighted clustering corrects the assignment.
Very large diagonal cluster chart with dozens of small clusters labelled with ancestral lines, rendered smoothly in the new large-chart mode
The new large-chart mode (zoom level "Large") stays responsive on thousand-match datasets. Each diagonal block is an annotated cluster.

Starting with specific matches of interest inside AutoLineage

As in AutoCluster, you can choose to start with specific matches of interest inside AutoLineage. This focuses an analysis on the shared matches of a single high-cM cousin and shows all the branches you share, which is especially powerful combined with weighted clustering.

Location analysis

AutoLineage can plot every birth, marriage and death location from a tree (or from all trees linked to a cluster) on an interactive map with an event-time slider. Useful for spotting regional clustering, migration patterns, and for comparing several trees simultaneously. Colour schemes of 2, 4 or 8 ancestral branches make geographic inheritance visible at a glance.

Top half: map of the northern Netherlands with coloured location pins clustered around small towns; bottom half: a four-colour ancestral tree aligned to the map's colour coding
Location plot (top) synchronised with a four-colour ancestral tree (bottom). Every pin on the map inherits the colour of its branch in the tree.

Tips

Other tools

Alongside the main automated analyses, Genetic Affairs provides several supporting tools for situations where automated retrieval is not available or where you want to work with data you already have.

CSV file-based clustering

Any DNA testing company can be clustered manually as long as you can export two CSV files: a match list and a shared-match list. This makes the tool useful for companies that do not support automated retrieval, such as LivingDNA, or for older exports you already have saved.

Input files

File Required columns
Match list Column A: match name · Column B: cM value (optional) · Column C: notes (optional)
Shared-match list (filename must contain "shared") Column A: primary match name · Column B: shared match name
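If you compile the data by hand, a short Python script can write the two files in the column layout described above. This is a sketch under assumptions: the file names are examples (the only stated rule is that the shared-match filename contains "shared"), the match names are placeholders, and no header row is written because the manual does not say whether one is expected.

```python
import csv
import tempfile
from pathlib import Path


def write_cluster_inputs(folder, matches, shared_pairs):
    """Write the match list and shared-match list as two CSV files."""
    folder = Path(folder)
    folder.mkdir(parents=True, exist_ok=True)
    match_path = folder / "matches.csv"
    shared_path = folder / "shared_matches.csv"  # filename must contain "shared"
    with match_path.open("w", newline="", encoding="utf-8") as handle:
        # Column A: match name, column B: cM (optional), column C: notes (optional)
        csv.writer(handle).writerows(matches)
    with shared_path.open("w", newline="", encoding="utf-8") as handle:
        # Column A: primary match name, column B: shared match name
        csv.writer(handle).writerows(shared_pairs)
    return match_path, shared_path


matches = [("Match A", 220, "paternal side?"), ("Match B", 45, ""), ("Match C", 38, "")]
shared_pairs = [("Match A", "Match B"), ("Match B", "Match C")]

out_dir = tempfile.mkdtemp()  # replace with a folder of your choice
match_path, shared_path = write_cluster_inputs(out_dir, matches, shared_pairs)
print(match_path, shared_path)
```

Keep the names in the shared-match file spelled exactly as in the match list, since names are the only thing the column layout gives to link the two files.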

How to run

  1. Export or manually compile the two CSV files from your testing company.
  2. On the members landing page, click Run AutoCluster and choose CSV upload.
  3. Upload both files and submit.
  4. Results arrive by email as a ZIP containing an HTML cluster chart and an Excel spreadsheet.
Reference. Patricia Coleman's walkthrough for LivingDNA shows how to gather the data manually and format the CSV files: Manual AutoClusters for LivingDNA. The same approach applies to any company that lets you view shared matches.

Recluster a MyHeritage chart

MyHeritage's embedded AutoCluster sorts clusters by size, placing the largest cluster first. Reclustering reorders and regroups those clusters using the Genetic Affairs algorithm, which can reveal finer structure and bring related clusters together that the size-sorted view kept apart.

  1. In your MyHeritage account, open the AutoCluster chart and save the page as an HTML file.
  2. On the members landing page, click Recluster MyHeritage AutoClusters.
  3. Upload the saved HTML file.
  4. The reclustered chart is returned as a ZIP with an HTML chart and Excel file.
MyHeritage AutoCluster chart after reclustering, showing clusters reordered and regrouped compared to the original size-sorted output
A reclustered MyHeritage AutoCluster chart. Clusters are reordered by the Genetic Affairs algorithm rather than by cluster size, bringing related clusters closer together along the diagonal.

Transform AutoCluster HTML to Excel

Older AutoCluster analyses were delivered as HTML files only. This tool converts any existing AutoCluster HTML report into an Excel spreadsheet, making the data easier to sort, filter and annotate, without having to re-run the analysis.

  1. On the members landing page, click Transform to Excel.
  2. Upload your existing AutoCluster HTML file.
  3. Download the resulting Excel file.
Screenshot of the Transform to Excel tool showing an AutoCluster HTML report being converted to an Excel spreadsheet
The Transform to Excel tool converts an existing AutoCluster HTML report into an Excel file for easier sorting, filtering and annotation.

Glossary

AutoCluster: Tool that groups DNA matches into shared-match clusters.

AutoKinship: Tool that predicts family trees from shared DNA alone.

AutoLineage: Flagship tool. Cross-company workbench for clustering, tree import, common-ancestor detection and tree reconstruction.

AutoSegment: Tool that clusters matches by overlapping DNA segments.

AutoTree: Tool that searches match trees for common ancestors and reconstructs partial trees.

Bucketing / Family Matching: FTDNA's system for labelling matches as paternal or maternal. Once bucketing is enabled, Genetic Affairs displays paternal/maternal (P/M) labels in cluster visualizations.

cM (centimorgan): Unit of genetic distance, used to measure the length of shared DNA segments. Most thresholds in Genetic Affairs are expressed in cM.

Cluster: A group of DNA matches who share DNA with each other as well as with you, suggesting a shared ancestral line.

Common ancestor / MRCA: The most recent ancestor shared by two or more descendants.

Endogamy: Historical pattern where people in a closed population mostly married within it. Inflates shared-cM values and shared-match counts; requires special handling.

GEDCOM: Industry-standard file format for exporting and importing family trees.

Gephi folder: Folder inside Genetic Affairs reports with nodes.csv (matches) and edges.csv (shared-match relationships). Re-importable into AutoLineage.
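
If you want to inspect the Gephi folder files outside AutoLineage, the sketch below shows one way to read them into a simple shared-match graph. The column names (Id, Label, Source, Target) follow Gephi's usual spreadsheet-import convention and are an assumption here; check the header row of your own nodes.csv and edges.csv and adjust the parameters if they differ. The sample data is made up.

```python
import csv
import io
from collections import defaultdict


def load_shared_match_graph(nodes_file, edges_file,
                            id_col="Id", source_col="Source", target_col="Target"):
    """Build (nodes, adjacency) from Gephi-style node and edge tables.

    Column names are assumptions; pass your own if the files differ.
    """
    # One dictionary per match, keyed by its identifier.
    nodes = {row[id_col]: row for row in csv.DictReader(nodes_file)}

    adjacency = defaultdict(set)
    for row in csv.DictReader(edges_file):
        a, b = row[source_col], row[target_col]
        # A shared-match relationship is symmetric, so store both directions.
        adjacency[a].add(b)
        adjacency[b].add(a)
    return nodes, dict(adjacency)


# Made-up sample data standing in for nodes.csv and edges.csv.
sample_nodes = io.StringIO("Id,Label\n1,Match A\n2,Match B\n3,Match C\n")
sample_edges = io.StringIO("Source,Target\n1,2\n2,3\n")

nodes, adjacency = load_shared_match_graph(sample_nodes, sample_edges)
print(sorted(adjacency["2"]))
```

With real files, replace the two StringIO objects with open("nodes.csv", newline="") and open("edges.csv", newline="").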

ICW (In Common With): Describes two matches who match both you and each other.
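
As an illustration of the ICW definition, the following sketch lists every pair of your matches who also match each other. The names and the matches_of mapping are hypothetical; this is not how Genetic Affairs stores or retrieves shared matches.

```python
from itertools import combinations


def icw_pairs(my_matches, matches_of):
    """Return pairs of my matches who also match each other.

    my_matches: set of people who match me.
    matches_of: hypothetical mapping of each person to the set of
                people on their own match list.
    """
    pairs = []
    for a, b in combinations(sorted(my_matches), 2):
        # Accept the relationship if either side reports the other.
        if b in matches_of.get(a, set()) or a in matches_of.get(b, set()):
            pairs.append((a, b))
    return pairs


my_matches = {"Ann", "Ben", "Cara"}
matches_of = {"Ann": {"Ben"}, "Ben": {"Ann"}, "Cara": set()}

result = icw_pairs(my_matches, matches_of)
print(result)
```

Here Ann and Ben are in common with each other, while Cara matches you but neither of them.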

Pile-up region: Chromosome region where many unrelated people share segments for population-level reasons. AutoSegment can filter these.
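
To make the idea of a pile-up concrete, the sketch below counts how many match segments cover each stretch of one chromosome and reports the stretches where that count reaches a threshold. It is a simplified illustration, not AutoSegment's actual filter; the threshold, the positions and the function name are assumptions.

```python
def pileup_regions(segments, min_matches):
    """Find stretches covered by at least min_matches segments.

    segments: list of (start, end) positions on a single chromosome.
    Returns a list of (start, end, depth) tuples; neighbouring
    stretches with different depths are reported separately.
    """
    events = []
    for start, end in segments:
        events.append((start, 1))   # a segment begins
        events.append((end, -1))    # a segment ends
    # At the same position, ends (-1) sort before starts (+1),
    # so segments that merely touch are not counted as overlapping.
    events.sort()

    regions = []
    depth = 0
    previous = None
    for position, delta in events:
        if previous is not None and position > previous and depth >= min_matches:
            regions.append((previous, position, depth))
        depth += delta
        previous = position
    return regions


# Made-up segments from three matches on the same chromosome.
segments = [(0, 10), (5, 15), (8, 20)]
crowded = pileup_regions(segments, min_matches=3)
print(crowded)
```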

Triangulation: Three or more people sharing the same segment at the same chromosomal location, strongly implying a common ancestor. Only GEDmatch publishes triangulation data directly.

Triangulated Group (TG): Set of testers who triangulate on the same segment.
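
The triangulation test can be sketched as an interval intersection: you and two matches triangulate when all three pairwise segments sit on the same chromosome and share a common stretch. The Segment class, the base-pair positions and the names below are illustrative assumptions; real tools also apply a minimum segment size, which is omitted here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    chromosome: str
    start: int  # base-pair position where the shared segment begins
    end: int    # base-pair position where it ends


def common_overlap(segments):
    """Return the stretch shared by all segments, or None if there is none."""
    if not segments:
        return None
    if len({s.chromosome for s in segments}) != 1:
        return None  # segments on different chromosomes cannot overlap
    start = max(s.start for s in segments)
    end = min(s.end for s in segments)
    if start >= end:
        return None
    return Segment(segments[0].chromosome, start, end)


# All three pairwise comparisons are needed: you vs Ann, you vs Ben, Ann vs Ben.
you_ann = Segment("7", 10_000_000, 30_000_000)
you_ben = Segment("7", 15_000_000, 40_000_000)
ann_ben = Segment("7", 12_000_000, 28_000_000)

shared = common_overlap([you_ann, you_ben, ann_ben])
print(shared)
```

Note that two matches overlapping on your chromosome is not enough on its own; the third comparison (Ann vs Ben) is what distinguishes triangulation from a coincidental overlap on different parental copies.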

Weighted vs. unweighted clustering: In AutoLineage, weighted clustering factors in the cM between shared matches; unweighted treats every relationship equally.
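
The difference between the two modes can be pictured as two ways of filling in the same shared-match table. The sketch below is an illustration only: how AutoLineage actually uses the cM values internally is not described here, and the function name and sample values are assumptions.

```python
def build_adjacency(edges, weighted):
    """Build a symmetric shared-match table.

    edges: iterable of (match_a, match_b, shared_cm) tuples.
    weighted: if True, store the shared cM; if False, store 1.0 for
              every relationship so that all count equally.
    """
    adjacency = {}
    for a, b, shared_cm in edges:
        value = float(shared_cm) if weighted else 1.0
        adjacency.setdefault(a, {})[b] = value
        adjacency.setdefault(b, {})[a] = value
    return adjacency


# Made-up shared-match relationships with the cM shared between the two matches.
edges = [("Ann", "Ben", 250.0), ("Ann", "Cara", 22.0)]

unweighted = build_adjacency(edges, weighted=False)
weighted = build_adjacency(edges, weighted=True)
print(unweighted["Ann"])
print(weighted["Ann"])
```

In the unweighted table Ann's links to Ben and Cara look identical; in the weighted table the 250 cM link to Ben counts for far more than the 22 cM link to Cara.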

Where to get help

Support channels

Further reading: blog walkthroughs and tutorials

Two independent bloggers have been documenting Genetic Affairs tools with real-world examples and step-by-step screenshots for years. Their articles are an excellent companion to this manual, especially when you want to see how a particular workflow looks end-to-end on a real match list. Posts are tagged by author below: Roberta Estes writes at DNAeXplained; Patricia Coleman writes at Patricia Coleman Genealogy. Both are trusted voices in the genetic-genealogy community and both work extensively with Genetic Affairs in their own research.

Articles are listed in chronological order within each tool so you can see how features have evolved. For a current, end-to-end walkthrough, start with Roberta Estes' February 2026 AutoKinship article: it covers the complete modern workflow from registering an FTDNA kit through to refined tree reconstruction in AutoLineage.

AutoCluster

AutoTree

AutoSegment

AutoKinship

AutoLineage

Overview & index articles