Trying to build a comprehensive address hierarchy for India

Hi,

I am trying to find the best source for all Taluks, Districts, States etc. in India, to build a comprehensive address database as described at this link: https://bahmni.atlassian.net/wiki/display/BAH/Address+Hierarchy

Do we already have this somewhere? If not, I found the source below. Is there a better starting point? http://www.indiastudychannel.com/India/states

Thanks again for any info,

best, Ashok

Hello @ashokraman,

We do not yet have a ready-made address hierarchy list for India. We are working on providing one as a CSV.

As a starting point, we can take the address hierarchy from the following source: http://censusindia.gov.in/2011census/Listofvillagesandtowns.aspx

Thanks, Amol.

Thanks @amol. Yes, we found this too, extracted everything, and created a CSV for each state. We can share them for everyone to use, if that would help.

In fact, we are running into problems uploading the larger files (AP, UP, Bihar etc.; UP is 4.57MB). Any suggestions? Should we split them up?

I am not able to upload a sample file here, unfortunately.

Thanks, Ashok

@ashokraman: Were you able to work around this problem?

Could you paste the file at http://pastebin.com and share the link, or put it on Google Drive and share a Google Drive link?

I used GSplit.exe to split the files by record count. I think I tried 10K lines, and that was still too big, so I have to reduce it further.

The Google Drive link is https://drive.google.com/file/d/0B_h7poErLnBjNDMtN0otcDcyWkE/view?usp=sharing

Thanks, Ashok

7000 records loaded fine. Thanks,
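
For anyone else hitting the same upload limit, the splitting can also be scripted rather than done with GSplit. The following is only a minimal sketch: the function name `split_csv`, the output file naming, and the assumption that the state files have no header row are all illustrative, and the default of 7000 rows simply reuses the size reported to load fine above.

```python
import csv
from pathlib import Path


def split_csv(src, out_dir, rows_per_file=7000):
    """Split src into numbered CSV parts of at most rows_per_file rows each.

    Assumes the file has no header row (an assumption, not something the
    thread confirms). Returns the list of part paths that were written.
    """
    src = Path(src)
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    parts = []
    out = None
    writer = None
    with src.open(newline="", encoding="utf-8") as f:
        for i, row in enumerate(csv.reader(f)):
            if i % rows_per_file == 0:
                # Start a new part file every rows_per_file rows.
                if out is not None:
                    out.close()
                part = out_dir / f"{src.stem}_part{len(parts) + 1:03d}.csv"
                out = part.open("w", newline="", encoding="utf-8")
                writer = csv.writer(out)
                parts.append(part)
            writer.writerow(row)
    if out is not None:
        out.close()
    return parts


# Hypothetical usage: split_csv("UP.csv", "UP_parts") would write
# UP_parts/UP_part001.csv, UP_parts/UP_part002.csv, and so on.
```

Reading and writing through the `csv` module, rather than cutting on raw line counts, keeps any quoted field that contains a comma or line break in one piece.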

Because state files of different sizes loaded inconsistently through the OpenMRS interface, I wrote a Python script to load the address_hierarchy_entry table directly. There are now 652516 records from 35 state files, but this is what I get from OpenMRS:

"Proxy Error The proxy server received an invalid response from an upstream server. The proxy server could not handle the request GET /openmrs/module/addresshierarchy/admin/manageAddressHierarchy.form.

Reason: Error reading from remote server"

So it still doesn’t work. Should we disable the address hierarchy altogether, so the user can enter whatever village, tehsil, district and state they like without entries in address_hierarchy_entry, or restrict it to just village/town/city, district and state?

Thanks for any suggestions, Ashok
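
The loading script itself is only in the shared Drive folder, so it is not reproduced here. As a rough illustration of the step any direct loader needs, the sketch below turns flat rows into parent-linked entries, creating each distinct place only once. The column order (state, district, tehsil, village) and the field names `id`, `name`, `level` and `parent_id` are assumptions for the example, not the actual address_hierarchy_entry schema.

```python
def build_entries(rows):
    """Turn (state, district, tehsil, village) rows into parent-linked entries.

    Returns a list of dicts with id, name, level and parent_id. Each distinct
    path through the hierarchy gets exactly one entry, so a state or district
    that appears in many rows is stored once.
    """
    entries = []
    ids_by_path = {}  # path tuple, e.g. ("State A", "District A") -> entry id
    for row in rows:
        parent_id = None
        path = ()
        for level, name in enumerate(row, start=1):
            name = name.strip()
            if not name:
                break  # stop at the first blank level in this row
            path += (name,)
            entry_id = ids_by_path.get(path)
            if entry_id is None:
                entry_id = len(entries) + 1
                ids_by_path[path] = entry_id
                entries.append(
                    {"id": entry_id, "name": name, "level": level, "parent_id": parent_id}
                )
            parent_id = entry_id
    return entries
```

Keying on the full path rather than the bare name matters because the same village or tehsil name can occur under different districts. Passing the rows from `csv.reader` for one state file would yield entries ready to be written out in whatever form the target table expects.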

Could you please also attach your OpenMRS log file for reference?

Could you also share the address hierarchy CSV and the Python script you are using? We will investigate this further.

I am sending you the directory with all the state files, as well as the Python files I used to load the table.

https://drive.google.com/open?id=0B_h7poErLnBjdGloOEtGWXdISDg

I am on Windows, with Python 3 and Oracle VirtualBox.

Thanks for your help, Ashok

From your code, if I am right, you are loading data into the OpenMRS database directly. It would be interesting to know why the error then refers to “/openmrs/module/addresshierarchy/admin/manageAddressHierarchy.form”. Could you please share the steps you executed?

Yes, that is right. The idea was to remove one hop by loading directly into the database. I was then refreshing the OpenMRS address hierarchy page to see whether OpenMRS saw the hierarchy the way I expected it to. With only a few states loaded, I could see this via the OpenMRS upload form. So I am still unsure whether my entire approach is right. I just wanted to get the basics working so the users could add a range of historical clinical forms from various parts of India. I am trying out a slightly different approach and will update here if it works.

Thanks, Ashok

If I have understood you correctly, you are referring to data migration. To the best of my knowledge, the system does not validate whether the address entered is a valid one or is present in the address hierarchy. The address hierarchy is mainly useful for auto-populating the other fields, such as District and State, when a Village is entered. When we enter a Village that is not listed, it permits free text in all the address fields.

We are also analyzing the reported issue in parallel, but wanted to let you know that data migration should not depend on the address hierarchy table, because person_address stores the address itself and not a link to the address_hierarchy_entry table.
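
To make that behaviour concrete, here is a toy model of it. This is not Bahmni or OpenMRS code: the function `resolve_address`, the dictionary shapes and the field names are all hypothetical, chosen only to show that a listed village fills in its parents while an unlisted one is accepted as free text.

```python
def resolve_address(village, hierarchy, typed_fields=None):
    """Return the address fields that would be stored for a village.

    hierarchy maps a village name to its known parent fields; typed_fields
    holds whatever the user entered by hand. Both shapes are hypothetical.
    """
    known = hierarchy.get(village)
    if known is not None:
        # Listed village: the parent fields are filled in automatically.
        return {"village": village, **known}
    # Unlisted village: nothing is rejected, the typed values are kept as-is.
    return {"village": village, **(typed_fields or {})}
```

In this model the stored result is always plain text values, which mirrors the point above that migrated addresses do not need matching rows in the hierarchy table.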