PeopleFinderTechStructuredDataSets

From Katrina Help Info

Red Cross (ICRC)

http://www.familylinks.icrc.org/katrina

Records: 134,000+

Status: Scraped and verified against schema. Working on splitting into multiple smaller files to upload.

Contact: op_prot_eur.gva [at] icrc.org

Notes:

  • I have sent e-mail to this contact address -- JamesDennett
  • Since Bill has run out of time to work on this scraping project I'll take a crack at it. I can be contacted via email at brent [at] bjohnson.net
  • I have modified the scraper to handle deltas and to parse first and last names. I ran the scraper last night and collected 94320 records. Unfortunately it hit an unexpected page and stopped. I'm rerunning the scrape and picking up the names that have been added since the last scrape and the ones after it stopped. --Bljohnson 14:13, 7 Sep 2005 (EDT)
  • Scrape completed -- Brent
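The delta and name-parsing steps mentioned in the notes above could look roughly like the sketch below. It is not the actual scraper: it assumes each scraped record carries a stable `id` key and that names arrive as a single string in either "Last, First" or "First Last" form, neither of which is confirmed here.

```python
def split_name(full_name):
    """Split a 'Last, First' or 'First Last' string into (first, last)."""
    full_name = full_name.strip()
    if "," in full_name:
        last, first = (part.strip() for part in full_name.split(",", 1))
        return first, last
    parts = full_name.split()
    if len(parts) < 2:
        # Only one token (or none): keep it as the last name rather than guess.
        return "", full_name
    return " ".join(parts[:-1]), parts[-1]


def delta(previous_ids, current_records):
    """Return only the records whose (hypothetical) 'id' was not seen before."""
    seen = set(previous_ids)
    return [rec for rec in current_records if rec["id"] not in seen]
```

Keeping the set of ids from the previous run means a rerun after a crash only has to process records that are new or were never reached.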

Gulf Coast News Survivor Connector

http://wx.gulfcoastnews.com/katrina/status.aspx

Records: 58,931

Status: Validated.

Contact: devin at nacredata.com

Contact: ken [at] gulfcoastnews.com

Notes: Emailed requesting participation on Tues 9/6 3:19pm

Katrina Data Project

http://www.katrinadataproject.com/index.aspx

Records: 33,743

Status: David G. has talked with them, and they're willing to work with us. We need someone to help them with PFIF.

MSNBC "Looking for" and "Safe" lists

http://www.msnbc.msn.com/id/9159961/ (Looking for)

http://www.msnbc.msn.com/id/9159954/ (Safe)

http://www.msnbc.msn.com/apps/connect/search.aspx/ (Search form)

Records: 150,000+

Status: Currently scraping.

Contact:

Notes:

  • I've started scraping the "Looking for" list now.

Public People Locator

http://www.publicpeoplelocator.com/

Records: 32,755

Status: some duplication in data, but fairly complete and orderly.

Contact: katrinapeoplefinder [at] yahoo.com

Notes: Emailed for participation on Tuesday 9/6 11:13pm PST :: AaronPava 02:15, 7 Sep 2005 (EDT)

Family Messages

http://www.familymessages.org/index.php

Records: 15,922

Status: Willing to help. Developer is on katrina dev mailinglist.

Contact: chaney [at ] dcre-labs.com

Note: Implementing PFIF. Test feed is at http://familymessages.yahoo.net/new/pfif.php?page=1

Emailed for status Tuesday 9/6 2:23PM PST

Response: (2:31 PST) Thanks Aaron, an extra set of eyes is always helpful.

Please take a look at (removed for privacy) for the "in test" examples. We have feedback concerning the UTC timestamp, xml headers and the need for a pfif: tag at the beginning. That is being worked now but if you see anything else amiss, please don't hesitate.

-dan

Katrina Safe

http://www.katrinasafe.com/WebEntryApplication/searchform.aspx

Records: Maybe 5,000 to 20,000 (There are 95 Smiths, and Smiths are about 1% of the US population.)

Status: not browsable; will need cooperation from site

Contact: katrinasafeweb [at] hotmail.com

Notes: Emailed for participation on Tuesday 9/6 11:16pm PST :: AaronPava 02:19, 7 Sep 2005 (EDT)
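The record estimate above rests on simple proportional reasoning: if a surname makes up a known share of the population, the number of matches for that surname divided by the share gives a rough total. A minimal sketch, using the 95 Smiths and the roughly 1% share quoted above (the function name is illustrative only):

```python
def estimate_total(surname_count, surname_share):
    """Estimate total records from the count of one surname and its population share."""
    return round(surname_count / surname_share)


# 95 Smiths at about 1% of the population suggests roughly 9,500 records.
print(estimate_total(95, 0.01))  # 9500
```

A single surname gives only a coarse figure, which is consistent with the wide 5,000 to 20,000 range stated above.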

Hurricane Katrina Survivor Registry

http://www.katrina-survivor.com/

Records: 13,987

Status: not browsable; will need cooperation from site.

             I think this is browsable. Try searching for '%':
             http://www.katrina-survivor.com/searchbyname.php?FirstName=%25&MiddleName=&LastName=
             I'm starting to scrape it now. -- ZBerke 01:44AM, 8 Sep 2005 (PST)

Contact: gtg944q [at] mail.gatech.edu (Justin Harper)

Notes: Emailed for participation on Tuesday 9/6 11:20pm :: AaronPava 02:21, 7 Sep 2005 (EDT)
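The wildcard search URL quoted in the status comment above works because a literal '%' has to be percent-encoded as %25 in a query string. A small sketch of building that URL with the standard library; the parameter names come from the URL above, while the function name is just for illustration:

```python
from urllib.parse import urlencode

BASE_URL = "http://www.katrina-survivor.com/searchbyname.php"


def wildcard_search_url(first="%", middle="", last=""):
    """Build the search URL; urlencode turns the '%' wildcard into '%25'."""
    query = urlencode({"FirstName": first, "MiddleName": middle, "LastName": last})
    return f"{BASE_URL}?{query}"
```

Called with no arguments, this reproduces the URL given in the comment above.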

LNHA Katrina Evacuee Directory

http://www.lnha.org/katrina/default.asp

Records: 4,500 (roughly)

Status: Added to list Sept 7, 10:30PM PST. The whole list appears on one page: search by last name '%'.

Contact: info[at]lnha.org (has not been contacted)

Katrina Finder

http://www.katrinafinder.us

Records: 4,223

Status: Helping out. Implementing RSS spec. Developers on mailinglist.

Contact: dan[at]katrinafinder.us

Notes: emailed for status Tues 9/6 2:30 PST

Katrina Tracker

http://www.katrinatracker.com

Records: 3,052

Status: Up for helping, developer is not on katrina dev mailinglist though.

Contact: help[at]katrinatracker.com (Paul)

Notes: emailed for status Tues 9/6 2:34 PST

I contacted Paul on 9/13. He is going to send me example data to get started. - Geoff Webb

Response: spoke at 3:25 by phone. Open to participate. Will get on mailing list.

Hurricane Help

http://katrina.earthlink.net/people/list

Records: 2,925

Status: Helping out. Implementing RSS spec. Developers on mailinglist.

Contact: holland3 [at] corp.earthlink.net

Note: Implementing PFIF feed

Emailed for status Tuesday 9/6 2:26PM PST

Houma Shelters

http://www.houmashelters.com/

Records: 2,800

Status: The official website for evacuee shelters in the Houma/Terrebonne Parish area. I (the webmaster) am working on a PFIF export.

Contact: webmaster [at] houmashelters.com / matthew [at] phusikos.com

Notes: emailed for status at 2:15 PST 9/6

Response: AaronPava 16:37, 8 Sep 2005 (EDT)

Hi Aaron,

Sorry for not getting back to you earlier. I just got back to work yesterday and my plate is all-too-full. It would be great if you or other developers could assist. I began work on it, but I'm just not sure if I'll have the time to finish up. I can send you the work-in-progress, the database schema we're using, and any other information that might be of help.

Thank you, Matthew

Hurricane Katrina Persons DB

http://connect.castpost.com/fulllist.php

Records: 2,290

Status: Scraped.

Contact: katrina [at] castpost.com

Notes: Emailed for participation on Tuesday 9/6 11:23pm PST :: AaronPava 02:24, 7 Sep 2005 (EDT)

Validation: Attempted to validate XML file against http://www.w3.org/2001/03/webdata/xsv and received the following error:

The following tags were not closed: xsvHardFault. Error processing resource 'http://www.w3.org/2001/03/webdata/xsv?docAddrs...


Find Katrina

http://www.findkatrina.com/

Records: 2,580, but fewer after garbage is cleaned up

Status: first attempt posted at [findkatrina.rss (http://www.dwiggins.net/katrinadev/findkatrina.rss)]

Contact: alexkehr [at] mac.com

(I have sent e-mail to this contact address -- JamesDennett (jdennett).)

Notes: Sent email to see about participation on Tues 9/6 3:57pm PST

Scraping Update: Open questions about this feed:

  1. Is it ok to leave the source date field blank if the original repository doesn't provide an entry date for the record?
  2. The original source just has a combined "contact" field, rather than separate fields for phone and e-mail. I briefly considered trying to grok out e-mail addresses, etc., but this would be pretty unreliable given the dirtiness of the data. So I just duplicated the data into both, figuring in this case it would improve searchability. Is this the right thing to do?
  3. Similarly, the original source does not separate first and last names, so people have entered them in all sorts of combinations. Rather than trying to interpret this and risk getting it wrong, I just put the whole string in last name. Is this ok?
  4. Ditto on address --- there is just one "lives in" field, so the data is pretty dirty. Just dumping it all in home_city. In addition, I am attempting to grok out the state (based on multiple choice of AL, LA, or MS) and fill it in automatically. If I can't match it to one of these states I'm leaving this field blank.

Appreciate any feedback! --Dmdwiggi 13:25, 8 Sep 2005 (EDT)
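The mapping choices raised in the questions above can be sketched as follows. This is only an illustration of those choices, not the scraper that produced the feed. The input keys (`name`, `contact`, `lives_in`) and the output keys `phone` and `email` are hypothetical; `last_name`, `home_city`, `home_state` and the source date follow the field names used above.

```python
import re

STATES = ("AL", "LA", "MS")


def guess_state(lives_in):
    """Return AL, LA, or MS if exactly one appears as a two-letter token, else ''."""
    tokens = set(re.findall(r"\b[A-Z]{2}\b", lives_in.upper()))
    matches = [state for state in STATES if state in tokens]
    return matches[0] if len(matches) == 1 else ""


def to_pfif_fields(raw):
    """Map one raw record (hypothetical keys) onto PFIF-style fields."""
    name = raw.get("name", "").strip()
    contact = raw.get("contact", "").strip()
    lives_in = raw.get("lives_in", "").strip()
    return {
        "first_name": "",           # not split: whole string goes in last_name
        "last_name": name,
        "phone": contact,           # combined contact duplicated into both fields
        "email": contact,
        "home_city": lives_in,      # single "lives in" field dumped here
        "home_state": guess_state(lives_in),
        "source_date": "",          # blank when the source gives no entry date
    }
```

Leaving the state blank when zero or several candidates match avoids writing a wrong value into the feed.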

Katrina Survivor

http://www.katrinasurvivor.net/find.cfm?PageNum_GetAll=1&sort=name

Records: 2,151 posts

Status: need to scrape

Contact: webmaster [at] katrinasurvivor.net (Joe Bykowski)

Notes: Emailed for participation on Tues 9/6 4:21pm PST

Response:

Aaron:

I truly appreciate your offer to include KatrinaSurvivor.net's databases in a unified survivor database through a PFIF feed. At this time, however, KatrinaSurvivor.net will be unable to participate in any such project.

I wish you the best of luck with this effort.

Joe Bykowski

Validation: Passed with the following messages:

Schema validating with XSV 2.10-1 of 2005/04/22 13:10:49

  • Target:
    • Real name:
    • Length: 2731730 bytes
    • Last Modified: Thu, 08 Sep 2005 13:15:39 GMT
    • Server: Apache/2.0.54 (Debian GNU/Linux) mod_jk2/2.0.4 PHP/4.3.11-0.dotdeb.1
  • docElt: {http://zesty.ca/pfif/1.1}pfif
  • Validation was strict, starting with type [Anonymous]
  • schemaLocs: http://zesty.ca/pfif/1.1 -> http://zesty.ca/pfif/1.1/pfif-1.1.xsd
  • The schema(s) used for schema-validation had no errors
  • instanceAssessed: true
  • No schema-validity problems were found in the target

Geeklibrarian 18:17, 8 Sep 2005 (EDT)
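Before submitting a file to a full schema validator such as XSV, a quick local check can catch the most basic problems. The sketch below only confirms that the text is well-formed XML and that the document element is `{http://zesty.ca/pfif/1.1}pfif`, as reported in the validation output above. It does not perform schema validation, and the function name is illustrative.

```python
import xml.etree.ElementTree as ET

PFIF_NS = "http://zesty.ca/pfif/1.1"


def looks_like_pfif(xml_text):
    """Return True if xml_text is well-formed and its root element is pfif:pfif."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    # ElementTree spells namespaced tags as '{namespace}localname'.
    return root.tag == f"{{{PFIF_NS}}}pfif"
```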

Hurricane Refugees

http://www.hurricanerefugee.com/names.asp

Records: 2,129

Status: YES!

Contact: content[at]hurricanerefugee.com / Greg VanDell egvandell [at] hotmail.com

Notes: Emailed for participation on Tues 9/6 4:32PM PST

Response: Please send any ASP code if available. If you only have PHP thats fine, I can just convert it. thx, -g

I can create an XML feed for you guys. I've been attempting to get in touch with the Red Cross to coordinate these efforts, but haven't had any success with them yet. I'm getting a large number of hits (already received a quarter million), so we could use this site as a portal. I'm running SQL Server and ASP...don't know what you guys are on.

Have you been able to link any of the other sites yet? Let me know if you guys want something more than an XML feed.

Thanks,

Greg Van Dell

Forest Hills, NY

Response2: (AaronPava 13:42, 7 Sep 2005 (EDT))

I have a realtime PFIF feed - for access please email me at -- content [at] hurricanerefugee.com

Hurricane Katrina Missing List

http://www.gwid.com/katrina.php

Records: 1,669

Status: Willing to help, Developer is on mailinglist.

Contact: kirk [at] gwid.com

Notes: emailed for status Tues 9/6 2:32 PST

Response: Yeah any help you can give will be helpful. I have been very busy trying to find a job in Montgomery AL. The database is MySQL with a PHP Front End.

Response2: Sure but give me a day I will have a new interface up that allows for more options. I am working with someone now on developing some search features along with photo postings. Thanks

Response3: I am forwarding the contact info of a guy that I am working with out of Cal. My link will be forwarded to his. His front end is better than mine and we will continue to upgrade it as necessary. He can help with setting up PFIF.

nik [at ]monkeymaximus.com his name is nik

thanks, kirk

Tulane Safe Registry

http://www.scribedesigns.com/tulane/

Records: 1,502

Status: needs help implementing

Contact: Harley Robertson frontdoor2[at]scribedesigns.com IM:sparrowhawk12345

Notes: Emailed for participation on Tuesday 9/6 11:25pm PST :: AaronPava 02:26, 7 Sep 2005 (EDT)

Response: contacted me by IM at 11:40pm PST :: AaronPava

Response2: K, I'm all for providing a feed - just dun know how.

It shouldn't be hard, we just have a simple MySQL DB - I kinda cheated and just added a table to a database with other stuff to get it up faster though, so I would like to work on whatever I need to work on myself - I'd just like some sample scripts, or whatever is involved, and I'll customize and install them. My ICQ is : 20345371 Y!,AIM,MSN : sparrowhawk12345

-Harley Robertson

Notes: Scraped at Google, (9/15) where they found 1933 records. Scraped XML uploaded to WIKI. Contact: pasztor at gmail dot com


InfoZone New Orleans Missing

http://www.theinfozone.net/NOLAmissing2.html

Records: 2962 (10/30 14:30PM EST)

Status: Added to list on 9/7 10:30PM PST

Contact: katrina[at]theinfozone.net (has been contacted)

Harrison County Missing/Inquired About Persons

http://co.harrison.ms.us/assistance/missing/

Records: 1132

Status: Needs to be scraped

Contact: webmaster@co.harrison.ms.us

Note: Contains only list of people who have been asked about. There's a list of confirmed fatalities at http://co.harrison.ms.us/assistance/confirmed/

CNN Safe List

http://www.cnn.com/SPECIALS/2005/hurricanes/list/

Records: 1,120

Status: Scraped.

Contact: hurricanevictims[at]cnn.com

Notes: Sent email for participation request on Tues 9/6 4:04pm

Missing Katrina

http://callhome.textamerica.com/

Records: 669

Status: might be difficult; photos are good, but data seems limited

Contact: callhome.123 [at] tamw.com

Notes: Sent an email requesting participation Oasisbob 02:59, 7 Sep 2005 (EDT)

Hurricane Survivors.org

http://www.hurricanesurvivors.org/database.html

Records: 596

Status: Willing to help.

Contact: valenkim [at] hotmail.com

Notes: emailed for status Tues 9/6 2:37 PST

Katrina Survivor Database

http://katrina.streetlampsoftware.com/

Records: 456

Status: Being scraped by Gabe Wachob (gwachob@wachob.com)

Contact: katrina [at] streetlampsoftware.com

Notes: Emailed to see about PFIF participation on Tuesday 9/6 2:53 PST

NCMEC Hurricane Katrina Children

http://www.missingkids.com/missingkids/servlet/PageServlet?LanguageCountry=en_US&PageId=2077

Excel spreadsheet, direct link is http://www.missingkids.com/en_US/documents/katrina.xls

Records: 334

Status: Devin waiting for info on PFIF id from Ping

Contact: "Contact Us" link times out on their site. Oasisbob

emergency-database

http://www.emergency-database.com/guide/

Records: 200 (does it really even have this many? -- wayward)

Status: ?

Contact: INFO[at]EMERGENCY-DATABASE.COM

Notes: Emailed requesting participation on Wed 9/6/05 at 7PM. Seems less structured than it appears.

Survivor Registry

http://www.survivorregistry.com/cgi-bin/show_all.pl

Records: 193

Status: Sent an email inviting participation. Oasisbob 03:06, 7 Sep 2005 (EDT)

Contact: survivorregistry [at] gmail.com

Notes: "Message" field has name of found, names of missing; rather unstructured data

Search for Missing People

http://www.searchformissing.org/

Records: 80

Status: ?

Contact: leepaulmartin [at] gmail.com

Notes: How do you get data out of Google Maps AJAX?

Looks like there are 80 points in the map and 92 rows in the table (which contains only the names of the searcher and the missing person, so may not be useful). I think the person who entered 184 as the number of records was counting each name in the table as a record rather than each row. The data for the 80 Google map points is actually in the JavaScript in the HTML page and could be parsed out of there. It's just names and addresses, though. --KCIvey 09:05, 7 Sep 2005 (EDT)

I will work on this tonight. Please email me at mmondok@clariondata.com if someone beats me to it. I am looking at putting out an XML feed once it is done. For now, note that the AJAX will just spit out structured, client-side script in the source that can be parsed with regex. --Mmondok 12:06, 7 Sep 2005 (EDT)

It doesn't look like this site is actually using AJAX -- there is no XMLHttpRequest object or any derived object. Look at the JavaScript line that starts: "html = "Lozano, Leonard 4646 Demontlizan";var point =..." It looks like each marker is created with a call to the createMarker function, passing lat, long, and the info. So you can just parse that line, or copy the JavaScript and modify the createMarker function to do something more useful, like a document.writeln in CSV format, which you could then copy out of your web browser. Let me know if you have any other questions about the Google Maps API or AJAX. -- ZBerke 10:23, 7 Sep 2005 (PST)

I worked on this last night. The map itself implements AJAX but the people are static, which makes things much easier. I am hoping to finish the scraping today. --Mmondok 08:23, 8 Sep 2005 (EDT)

The site is scraped and I have the people, but I am trying to match up the people searching for them. I have to tend to family matters this weekend, but I will make available what I have. --Mmondok 03:06, 9 Sep 2005 (EDT)
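The regex approach discussed in this thread could be sketched as below. The exact layout of the page's JavaScript is not shown here, so `SAMPLE_JS` is a hypothetical stand-in: it assumes each marker is a literal `createMarker(lat, long, "info")` call and that the info string separates name and address with `<br>`. The names and addresses in it are placeholders, not real records.

```python
import csv
import io
import re

# Hypothetical stand-in for the page source; the real layout may differ.
SAMPLE_JS = '''
createMarker(29.95, -90.07, "Doe, Jane<br>123 Example St");
createMarker(30.00, -90.10, "Roe, Richard<br>456 Sample Ave");
'''

# Matches createMarker(<lat>, <long>, "<info>") with optional whitespace.
MARKER_RE = re.compile(
    r'createMarker\(\s*(-?\d+(?:\.\d+)?)\s*,\s*(-?\d+(?:\.\d+)?)\s*,\s*"([^"]*)"\s*\)'
)


def extract_markers(js_source):
    """Return (name, address, lat, lng) tuples for every createMarker call found."""
    rows = []
    for lat, lng, info in MARKER_RE.findall(js_source):
        name, _, address = info.partition("<br>")
        rows.append((name.strip(), address.strip(), float(lat), float(lng)))
    return rows


def to_csv(rows):
    """Serialize the extracted rows as CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["name", "address", "lat", "lng"])
    writer.writerows(rows)
    return buf.getvalue()
```

Parsing the static script this way avoids running the page in a browser, at the cost of breaking if the site changes how it writes the marker calls.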

Find Our Family

Records: ?

Status: ?

Contact: Yorweb.com Inc, 3256 Yates St, Bartlett TN 38134, Ph. Local 381 1715, 1-901-381-1715

Notes: Referenced in the Memphis Commercial Appeal as website set up by local company.

Operation Kare

http://www.kare.arkansas.gov/

Records: 23,000+

Status: Excel spreadsheet available from site; conversion to PFIF in progress.

Contact: Kay A. Doggett, 607 Belvue Road, Travelers Rest, SC 29690, work: 864-573-1643, cell 864-982-5396, < debugdiva @ gmail.com >

SafeKatrina.com

Records: 120?

Status: being scraped

Contact: joe at gmail dot com

Notes:

wecaretexas

Records: 216172

Status: Scraped at Google, validated, uploaded to WIKI.

Contact: pasztor at gmail dot com
