
What is necessary (and unnecessary) for analyses of offender databases

Jason R. Gilder, August 16, 2008. Forensic Bioinformatics (www.bioforensics.com), gilder@bioforensics.com


Presentation Transcript


  1. What is necessary (and unnecessary) for analyses of offender databases Jason R. Gilder August 16, 2008 Forensic Bioinformatics (www.bioforensics.com) gilder@bioforensics.com

  2. Offender databases • Originally designed for convicted offenders • CODIS: Combined DNA Index System • Expanded • Unsolved crime samples • Arrestees • Elimination profiles

  3. CODIS • COmbined DNA Index System • National: NDIS • State: SDIS - fewer restrictions • Local: LDIS - fewest restrictions • Convicted Offender Profiles in NDIS: 6,031,000 • Forensic Profiles in NDIS: 225,400 • More than 71,800 cold hits

  4. Why analyze a database? • Questions remain regarding the weight of a DNA database match • Random Match Probability (RMP) • Database Match Probability (DMP) • Balding & Donnelly LR • Other • Composition of database may affect chance of a coincidental match • Presence of relatives
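
The gap between the RMP and the DMP can be made concrete with a little arithmetic. The sketch below assumes the DMP is computed the way it commonly is, as the RMP multiplied by the number of profiles searched, and compares it with the probability of at least one coincidental match among unrelated profiles. The RMP and database size used here are hypothetical values chosen for illustration, not figures from the slides.

```python
def database_match_probability(rmp: float, n_profiles: int) -> float:
    """RMP times the number of profiles searched, capped at 1."""
    return min(1.0, rmp * n_profiles)


def prob_at_least_one_match(rmp: float, n_profiles: int) -> float:
    """Chance that at least one of n unrelated profiles matches by coincidence."""
    return 1.0 - (1.0 - rmp) ** n_profiles


# Hypothetical inputs (assumptions, not values from the presentation).
rmp = 1e-9             # one-in-a-billion random match probability
n_profiles = 1_000_000  # size of the database searched

dmp = database_match_probability(rmp, n_profiles)
at_least_one = prob_at_least_one_match(rmp, n_profiles)
print(f"RMP = {rmp:.1e}, DMP = {dmp:.1e}, P(at least one match) = {at_least_one:.4e}")
```

Both calculations treat the profiles as unrelated, which is exactly the assumption that the presence of relatives in a database calls into question.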

  5. Structure of a DNA database • Collection of records • Structured Query Language (SQL) format

  6. Examples of possible issues with the use of DNA databases • Michigan v. Gary Leiterman • Evidence: blood found on victim’s hand • Cold hit to a 4-year-old boy • R v. Sean Hoey • Evidence: explosive device • Cold hit to a 14-year-old boy • Jaidyn Leskie inquest (Australia) • Evidence: clothing from deceased • Cold hit to a rape victim

  7. Lab error and false cold hits

  8. How a database can be analyzed • Perform all pairwise profile comparisons • the “Arizona Search” • P1 with P2, P1 with P3, P1 with P4, …, P1 with Pn • P2 with P3, P2 with P4, P2 with P5, …, P2 with Pn • Analyze profile similarity • Count number of matching loci and alleles • Perform kinship analyses
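
The all-pairs search described on this slide can be sketched in a few lines. This is a minimal illustration, not the software used for any actual database review: the three toy profiles, the three loci, and the function names `compare` and `all_pairs` are all assumptions made for the example (a real CODIS profile has 13 loci).

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy profiles: locus -> pair of alleles.
profiles = {
    "P1": {"D3": (15, 16), "vWA": (17, 18), "FGA": (21, 24)},
    "P2": {"D3": (15, 16), "vWA": (17, 19), "FGA": (21, 24)},
    "P3": {"D3": (14, 14), "vWA": (16, 18), "FGA": (22, 23)},
}


def compare(a, b):
    """Return (matching_loci, shared_alleles) for two profiles."""
    matching_loci = 0
    shared_alleles = 0
    for locus in a.keys() & b.keys():
        alleles_a, alleles_b = Counter(a[locus]), Counter(b[locus])
        # Multiset intersection: a homozygote (14, 14) can share at most
        # one allele with a heterozygote (14, 15).
        shared_alleles += sum((alleles_a & alleles_b).values())
        if alleles_a == alleles_b:
            matching_loci += 1
    return matching_loci, shared_alleles


def all_pairs(profiles):
    """Compare P1 with P2..Pn, then P2 with P3..Pn, and so on."""
    return {
        (i, j): compare(profiles[i], profiles[j])
        for i, j in combinations(sorted(profiles), 2)
    }


results = all_pairs(profiles)
for pair, (loci, alleles) in results.items():
    print(pair, "matching loci:", loci, "shared alleles:", alleles)
```

Tallying the first number across all pairs gives counts of the kind reported for Arizona; tallying the second gives a shared-allele distribution, which is also the raw input for kinship analyses.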

  9. Arizona Match Data • 65,493 Profiles • 122 pairs matched at 9 of 13 loci • 20 pairs matched at 10 of 13 • 1 pair matched at 11 of 13 • 1 pair matched at 12 of 13
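
To put these counts in context, it helps to know how many pairwise comparisons a database of this size implies. The short calculation below uses only the figures on this slide; the variable names are illustrative.

```python
from math import comb

n_profiles = 65_493
# Pairs reported on the slide, keyed by number of matching loci (out of 13).
pair_counts = {9: 122, 10: 20, 11: 1, 12: 1}

total_pairs = comb(n_profiles, 2)          # n * (n - 1) / 2
pairs_9_plus = sum(pair_counts.values())   # pairs matching at 9 to 12 loci
fraction = pairs_9_plus / total_pairs

print(total_pairs)  # 2144633778
print(pairs_9_plus, "pairs, or about one in", round(1 / fraction), "comparisons")
```

More than two billion comparisons are implicit in a database of about 65,000 profiles, which is why even very small per-pair match probabilities can produce a visible number of partial matches.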

  10. Review of Victoria State Database • Krane/Paoletti analysis: >11,000 profiles, each compared to all others across 9 loci • Shared alleles: observed occurrences • 14: 401 • 15: 27 • 16: 1 • 17: 16 • 18: 0 • AussieBump

  11. [Figure not preserved in the transcript; only the axis values 300, 100, 20, 1 survive]

  12. Issues with the release or analysis of a DNA database • Privacy concerns • Names, social security numbers, DNA profiles, addresses, etc. • Issues with analysis • Duplicate profiles, multiple databases, presence of relatives, processing time, CODIS requirements • Legal issues • California Proposition 69

  13. Issue 1: Privacy concerns • Database contains private information that should not be released • Answer: provide anonymous profiles only • Accomplished through one command • SELECT D3, vWA, FGA, …, D7 FROM CODIS_DB
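
The "one command" idea can be tried out against a throwaway database. The sketch below uses Python's built-in sqlite3 module with an in-memory table; the table layout, the identifying columns (`name`, `ssn`), the three loci shown, and the sample rows are all hypothetical, since the slides do not describe the real CODIS schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: identifying columns plus a few locus columns.
conn.execute(
    "CREATE TABLE CODIS_DB (name TEXT, ssn TEXT, D3 TEXT, vWA TEXT, FGA TEXT)"
)
conn.executemany(
    "INSERT INTO CODIS_DB VALUES (?, ?, ?, ?, ?)",
    [
        ("Person A", "000-00-0001", "15,16", "17,18", "21,24"),
        ("Person B", "000-00-0002", "14,14", "16,18", "22,23"),
    ],
)

# Select only the locus columns; names and SSNs never leave the database.
cursor = conn.execute("SELECT D3, vWA, FGA FROM CODIS_DB")
columns = [description[0] for description in cursor.description]
anonymous_profiles = cursor.fetchall()

print(columns)
for row in anonymous_profiles:
    print(row)
conn.close()
```

Because the projection happens inside the database, the exported file contains genotypes only, with nothing that links a row back to a person.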

  14. Issue 2: Duplicate profiles • Many databases contain at least 10-15% duplicate profiles • Answer: ignore duplicates in analysis • A fairly thorough database analysis can take place with duplicates removed • Also identify potential mistyping rate • The lab may be able to cull out duplicates from the same individual with additional information (e.g. SSN)
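
A rough sketch of how duplicates might be handled without any identifying information: drop exact repeats, then flag pairs that differ at only one locus as candidates for review. The sample rows, the three-locus profiles, and the one-mismatch threshold are assumptions for illustration; a near match may reflect a typing error, but it may equally be a relative or an unrelated person, so flagged pairs would still need follow-up.

```python
from itertools import combinations

# Hypothetical rows: (sample_id, genotypes at D3, vWA, FGA).
rows = [
    ("S1", ("15,16", "17,18", "21,24")),
    ("S2", ("15,16", "17,18", "21,24")),  # exact duplicate of S1
    ("S3", ("15,16", "17,19", "21,24")),  # differs from S1 at one locus
    ("S4", ("14,14", "16,18", "22,23")),
]


def drop_exact_duplicates(rows):
    """Keep the first sample seen for each distinct profile."""
    first_seen = {}
    for sample_id, profile in rows:
        first_seen.setdefault(profile, sample_id)
    return [(sample_id, profile) for profile, sample_id in first_seen.items()]


def near_duplicates(rows, max_mismatches=1):
    """Flag pairs that differ at no more than max_mismatches loci."""
    flagged = []
    for (id_a, a), (id_b, b) in combinations(rows, 2):
        mismatches = sum(x != y for x, y in zip(a, b))
        if 0 < mismatches <= max_mismatches:
            flagged.append((id_a, id_b))
    return flagged


unique_rows = drop_exact_duplicates(rows)
flagged_pairs = near_duplicates(unique_rows)
print([sample_id for sample_id, _ in unique_rows])
print(flagged_pairs)
```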

  15. Issue 2b: Multiple databases • California DOJ holds information in two databases that can be cross-referenced to remove duplicates • Login DB – contains unique “CII” ID and accession numbers of all samples for that individual • SDIS – contains accession number and profile • Answer: JOIN the data with one command • Only select the first accession number profile • SELECT D3, vWA, FGA, …, D7 FROM SDIS JOIN LOGIN_DB ON LOGIN_DB.ACCESSION1 = SDIS.ACCESSION
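
The cross-reference can be demonstrated on toy tables with Python's built-in sqlite3 module. Everything about the schema here is an assumption made for the example: the column names `CII` and `ACCESSION2`, the three loci, and the sample rows are hypothetical, and the sketch writes the join condition with `ON`, which is the portable form.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical layouts for the two databases named on the slide.
conn.execute("CREATE TABLE LOGIN_DB (CII TEXT, ACCESSION1 TEXT, ACCESSION2 TEXT)")
conn.execute("CREATE TABLE SDIS (ACCESSION TEXT, D3 TEXT, vWA TEXT, FGA TEXT)")

conn.executemany(
    "INSERT INTO LOGIN_DB VALUES (?, ?, ?)",
    [("C1", "A1", "A2"),   # one individual sampled twice
     ("C2", "A3", None)],  # one individual sampled once
)
conn.executemany(
    "INSERT INTO SDIS VALUES (?, ?, ?, ?)",
    [("A1", "15,16", "17,18", "21,24"),
     ("A2", "15,16", "17,18", "21,24"),  # repeat sample from individual C1
     ("A3", "14,14", "16,18", "22,23")],
)

# Keep only the profile tied to each individual's first accession number.
one_per_individual = conn.execute(
    "SELECT D3, vWA, FGA FROM SDIS "
    "JOIN LOGIN_DB ON LOGIN_DB.ACCESSION1 = SDIS.ACCESSION "
    "ORDER BY SDIS.ACCESSION"
).fetchall()

print(one_per_individual)
conn.close()
```

Three SDIS records go in and two profiles come out, one per individual, and the result still contains no identifying fields.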

  16. Issue 3: Presence of relatives • It is difficult to identify the presence of relatives by hand by simply looking at the CODIS records • “There are a significant, but unknown number, of such related individuals in California’s offender database.” – Kenneth Konzack • Answer: Exactly!

  17. Issue 4: Processing time • Performing an internal search of the database will take too long (a week or more) and will not allow for CODIS searches during that time • Answer: perform an analysis on a separate computer or computers • Pairwise database search is “embarrassingly parallel”
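
"Embarrassingly parallel" means the comparisons for one profile do not depend on the results for any other, so the work can be split into independent pieces and the partial counts simply added up. The sketch below shows that partitioning on toy data. The profiles and function names are hypothetical, and a thread pool is used only to keep the example self-contained; CPU-bound work in CPython would need separate processes or separate machines, as the slide suggests, to see a real speedup.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical two-locus profiles, for illustration only.
profiles = [
    ("15,16", "17,18"),
    ("15,16", "17,18"),
    ("14,14", "16,18"),
    ("15,16", "17,18"),
]


def count_matches_for_row(i, profiles):
    """Compare profile i with every later profile; needs no other row's result."""
    return sum(
        1 for j in range(i + 1, len(profiles)) if profiles[i] == profiles[j]
    )


def serial_count(profiles):
    """Process every row one after another."""
    return sum(count_matches_for_row(i, profiles) for i in range(len(profiles)))


def parallel_count(profiles, workers=4):
    """Hand each row to a worker and add up the partial counts."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_counts = pool.map(
            lambda i: count_matches_for_row(i, profiles), range(len(profiles))
        )
        return sum(partial_counts)


serial_total = serial_count(profiles)
parallel_total = parallel_count(profiles)
print(serial_total, parallel_total)
```

Because each row's work is independent, the same split works across any number of machines operating on a copy of the anonymized profiles, leaving the production system free for routine CODIS searches.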

  18. Issue 5: Legal issues • Legal statutes (e.g., California Proposition 69) prohibit release of database to citizens • Answer: 38 state statutes (including CA) allow for an outside review of their database for statistical analysis • Many require the removal of identifying information

  19. Questions?
