Source language: Translate to:

Removing duplicate records in a database

Questions or information that don't quite fit anywhere else

Moderator: Neosoft Support

Removing duplicate records in a database

Postby beno » Mon May 23, 2005 5:41 pm

Hi,

Well my question is not about NB, is about DBs but I put it here, maybe someone can hekp me.

I have a NBDB with 11,000 records. Different persons use it and I'm affraid now it has duplicate records.

This database is a medical app wich stores patients records. Each record is for a particular patient. We have its name, sex, age, history, evolution, etc.

I know that if I sort the name field, duplicates will appear together... but if I have 11,000 it is going to be a bad idea to try to check the list manually.

There is no key for each record, just the fields described..

Mmmm, If I try to compare each one with the rest... this is going to be real slow...

Maybe I can compare each person name with the records that only starts with the same letter... as I type this some ideas are comming to my neurons...

Thanks for your time...

Saludos

beno
User avatar
beno
 
Posts: 678
Joined: Fri Apr 01, 2005 9:03 am
Location: México

Postby Leos » Mon May 23, 2005 6:43 pm

hi Beno,

My suggestion:

in this case i use dbfquery command

....
dbfQuery "sample.dbf" "Name==Beno AND cidade==mexico"

if [dbfQueryResult]>0
... you have great chances that it is a duplicated regist

so, use

dbfDeleteRecord
dbfshowall
dbfNext


you can do this procedure with a 'loop', or 'while'

I think this is not to slow with 11.000 records (i do this with a database with over 70.000 records, without problems... (just about one hour)

THIS is one more useful command that i suggest to include in the next update from this GREAT plugin

...find duplicated regists
User avatar
Leos
 
Posts: 178
Joined: Mon Apr 04, 2005 11:13 am
Location: Coimbra, Portugal

Postby Gaev » Mon May 23, 2005 6:47 pm

beno:
If I sort the name field, duplicates will appear together... If I try to compare each one with the rest... this is going to be real slow...
You only have to compare each current record with the previous (not all the rest) ... something like this ...

dbfSort "myDB" ...
SetVar "[Previous.Name]" ""
SetVar "[Previous.Age]" ""

Loop "1" "[dbfRecordCount]" "[counter]"

If "[myDB.Name]" "=" "[Previous.Name]"
If "[myDB.Age]" "=" "[Previous.Age]"
... same as previous record ... delete it ?
EndIf
EndIf

SetVar "[Previous.Name]" "[myDB.Name]"
SetVar "[Previous.Age]" "[myDB.Age]"
dbfNext "myDB"

EndLoop

... the key question is going to be which of the duplicate records should be deleted ? or are they 100% identical ?

If looping through 11,000 records takes too long ... the only thing I can suggest is ... a faster (make that super fast) machine.
User avatar
Gaev
 
Posts: 3738
Joined: Fri Apr 01, 2005 7:48 am
Location: Toronto, Canada

Postby beno » Mon May 23, 2005 7:06 pm

Gaev,

You are right!

Once sorted the DB I have only to compare with previous and next.

Possibly do a loop if there are many duplicates.

Thanks a lot,

beno

PS thanks for your time too Leos!
User avatar
beno
 
Posts: 678
Joined: Fri Apr 01, 2005 9:03 am
Location: México

Postby Sam Cox » Tue May 24, 2005 6:12 am

Hi beno,

Gaev's solution is a good one -- and it's now in my notebook!

Can you improve your medical record database program so that continued use will not result in more duplicate entries?

-- Sam
User avatar
Sam Cox
 
Posts: 768
Joined: Fri Apr 01, 2005 7:30 am
Location: Loveland CO USA

Postby beno » Tue May 24, 2005 6:56 am

Hi Sam,

Good idea, now I think I can add your idea: check the existence of records using the patient's name before adding it to the database...

Thanks for your help,

beno
User avatar
beno
 
Posts: 678
Joined: Fri Apr 01, 2005 9:03 am
Location: México

Postby cp4w » Thu Aug 18, 2005 4:10 pm

Have not the patients a personal code, like fiscal code or medical public service code?

Bye
Licia
cp4w
 
Posts: 533
Joined: Sun Apr 03, 2005 4:37 pm
Location: Great Neck, NY


Return to Misc. Questions and Information

Who is online

Users browsing this forum: No registered users and 2 guests

cron