Bahmni Standard 1.0 Initializer Module for data migration

Hello, I’m exploring the idea of migrating an existing Bahmni 0.93 installation to Bahmni Standard on Docker, and trying to find out what difficulties we could face.

Thankfully I was able to get the database to transfer nicely from 0.93 to Bahmni Standard in Docker, and now I am learning what the Initializer module is. This module seems to reflect a fairly substantial change in the Bahmni configuration system.

When I started Bahmni with my imported database, I think at some point it started to import the OCL concepts on top of our concept dictionary, and this generated many errors related to duplicated concepts (probably because I didn’t realize this was happening and didn’t let the process run long enough to finish the first time). I also realized there are many configurations in the docker container under /openmrs/data/configurations which may be undesirable for us.

Changes made by Liquibase seemed “safer” because they always check preconditions. I feel uncertain about how all the data in the CSVs under the configuration folder would get applied to an existing system.

Our clinic is fairly small, and I don’t think it is realistic for us to manage our configuration using the files in the initializer module - configuration of the production system will probably be updated directly in the OpenMRS administration console.

I’m wondering whether it makes sense to disable the initializer module altogether. I see that there is a liquibase file in configuration/liquibase/liquibase.xml. Does this mean that the initializer module is necessary for updating the imported database? Perhaps I just need to review all of the configuration in the initializer module and make a trimmed-down version for our imported database?

I appreciate any input. These are just some thoughts I was having about this.

Has this been done in production yet?

If not, then good practice would be to 1) first transition your 0.93 deployment so that your OpenMRS distribution is properly configured via Iniz, and 2) then upgrade to Bahmni 1.0.0.

You will note, by the way, that Bahmni itself eventually adopted Iniz as well (I see it being added to the Bahmni distro in July 2022).

Iniz was created precisely to move away from managing master data directly on a production system. I’m a bit astounded that you seem to be ending up in a place where you’d want to walk this back. The smaller the clinic, the heavier the weight of the tech debt of an OpenMRS distribution that becomes increasingly difficult to control through a proper versioning process.

As a service provider we are regularly called in to clean up such distributions… And there is sometimes a point where the slate has to be wiped because the cost of upgrading becomes unrealistic.

No, so far this is just a test system.

Thanks for the input. I think I need to take some time to understand the Iniz module better. Would the basic strategy be something like:

  • Begin with the ‘standard’ Bahmni configuration from upstream in the openmrs ‘configuration’ folder
  • Modify the configuration to reflect our local configuration choices
  • Place the configuration files under version control
  • For future updates, pull in configuration changes from Bahmni using something like git merge from upstream? (see the sketch below)
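
For example, something roughly like this (just a sketch; the repository URL and branch name are placeholders, not the actual upstream repo):

# Hypothetical sketch: keep the local configuration under version control and
# pull in upstream changes via a second remote. URL and branch are placeholders.
cd /path/to/our-bahmni-config
git init
git add . && git commit -m "Import our current configuration"
git remote add upstream https://github.com/example/bahmni-default-config.git
git fetch upstream
git merge upstream/main    # review and resolve conflicts in favour of local choices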

At our facility, we historically have not always had someone on hand to help with managing the technical side of Bahmni+OpenMRS, and quite a bit of configuration has been done by trial and error by some of the doctors. We have a test system for them to try things out on. I take it that switching to the Iniz module would mean that doctors could still try things on the test system, but changes to production would have to go through IT staff, so that those changes are made using the version-controlled configuration files?

That’s exactly right :+1:

And you understood perfectly that the so-called OpenMRS configs (i.e. those loaded by Iniz) exist to ensure that OpenMRS distributions can be versioned and that their deployments are reproducible.

This is interesting to read, especially about setting up test and prod environments. Is there a difference in how you build/manage each environment? Do you push changes from test to prod, or do they run as purely standalone systems where you just manually replicate the steps on prod once you’re happy?

I guess we do a little bit of both.

We’ve customized some things in the bahmni frontend, and for those I have a small script on the development machines which builds the frontend and pushes it to either the test system or the production system. We try things first on the test machine, and if they work they get pushed to the production system.
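
Roughly, the shape of that script is something like this (a simplified sketch; the build command, host names and paths are placeholders rather than our real ones):

#!/bin/bash
# Hypothetical sketch: build the customised frontend and push it to the chosen machine.
set -e
target="${1:?usage: push_frontend.sh test|prod}"
yarn build                                        # or whatever your frontend build step is
rsync -av --delete dist/ "bahmni-${target}:/var/www/bahmniapps/"   # placeholder destination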

For bahmni config files, OpenMRS administration and changes to concepts, normally we try things out on the test machine, and when we are happy with them we replicate the changes by hand on the production machine.

I also recently tried putting together a script to “convert” a Bahmni system between the production, test and backup roles. This changes the hostname and IP, as well as the command line prompt and the title on the Bahmni landing page, to help make it clear which system it is. That’s on hand in case the live system goes down, so it can be replaced with the backup system.
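
Very roughly, the steps it performs look like this (a simplified sketch; every name, address and path here is a placeholder rather than our real one):

#!/bin/bash
# Hypothetical sketch of the "convert" steps; all names, addresses and paths are placeholders.
new_hostname="bahmni-test"
new_ip="192.0.2.10"                                   # placeholder address
landing_page="/path/to/bahmni/landing/index.html"     # placeholder path

hostnamectl set-hostname "$new_hostname"
nmcli con mod "eth0" ipv4.addresses "$new_ip/24"      # placeholder connection name
echo 'PS1="[TEST] \u@\h:\w\$ "' > /etc/profile.d/bahmni-env.sh   # mark the shell prompt
sed -i 's/Bahmni/Bahmni (TEST)/' "$landing_page"                 # mark the landing page title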

I have also been using the script to convert a clone of the production system VM into a new test system.

I’m not sure if these are good ways of doing things or not, but they seem to work okay. We only had to use the backup system once after an update, and it’s easy to do the wrong thing when feeling under pressure to get things running again.

That is some really great info. All the guides I’ve looked at regarding setup don’t clearly articulate how to get it set up in that manner, so it would be useful to try to document it, which I’d be happy to help with. Likewise for the backup routine and recovering from backup. I think it’s the one hurdle, especially when you’re trying to deploy these into places where on-the-ground skills can’t always be taken for granted. If you can make it as simple as possible to install and manage (with the right architectural mindset), as well as to back up and recover, it goes a long way.

Do you run them on physically separate machines, or logically separate?

Yes, I agree, having a good backup and restore procedure helps for setting things up in a location where you’re not sure who will be on the ground in case something happens.

I think we modified our backup system quite a bit from the one included with Bahmni, and I can’t remember all the differences.

As of last year, we have two servers running a virtual environment. We used to use VirtualBox, but have now switched to Proxmox. That does require learning some technical things, but it works well. One server runs the backup system, and the other the live system.

Once we set up the new server I spent some time making and testing a backup and restore script, adding some logging for the backups/restores, and sending an email on a failure.

One thing I like about this setup is that our production system is scheduled to make backups daily, and the backup system is scheduled to restore from the latest backup a few hours later, so that the backup system keeps the data up to the previous day. This might be a little bit overkill, but it is nice to know that the restore script actually works since it is running regularly.
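
Roughly, the scheduling looks like this (illustrative crontab entries only; times, paths and the restore script name are examples, not our exact setup):

# On the production machine: run the backup every night.
0 1 * * * /opt/bahmni-backup/backup_master
# On the backup machine: restore the most recent backup a few hours later.
0 5 * * * /opt/bahmni-backup/restore_master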

@mdg583, I will share my experience if it helps you. In our case, we performed a large-scale migration of production to Bahmni Standard: 13+ GB of data, a 0.93 system running on MySQL 5.7. We encountered many issues, but here are the major ones that many implementations might have in common.

MySQL 5.7 to MySQL 8.0 required fixes: MySQL 5.7 allows 0000-00-00 dates by default. This required us to clean up or replace all zero-date entries for MySQL 8.0.
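
For illustration, a rough sketch of the kind of check involved (the table and column are just examples; any DATE/DATETIME column can be affected):

# Illustrative only: count zero-date rows before migrating; table/column are examples.
mysql -u openmrs-user -p openmrs -e \
    "SELECT COUNT(*) FROM obs WHERE obs_datetime = '0000-00-00 00:00:00';"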

Program modules and concept mapping issues: the newer Bahmni Standard version uses UUIDs for concepts, while the older system relied on conceptId. This caused errors in the existing UI and program workflows until all mappings were corrected.

UUID conflicts from Liquibase: some UUIDs that are automatically created by Liquibase conflicted with our production data. This required creating new Liquibase changesets as replacements to maintain consistency. Examples include order.drugRoutesConceptUuid, order.durationUnitsConceptUuid, order.dosingInstructionsConceptUuid and many others.
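
A quick way to see what those properties point to after the import (illustrative only; the database user and name are placeholders):

# Illustrative only: inspect the global properties named above.
mysql -u openmrs-user -p openmrs -e \
    "SELECT property, property_value FROM global_property
     WHERE property IN ('order.drugRoutesConceptUuid',
                        'order.durationUnitsConceptUuid',
                        'order.dosingInstructionsConceptUuid');"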

Database permission and Liquibase errors after replacing the database: we had to grant the database user the required permissions and clean the .openmrs-lib-cache.
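
For illustration, the kind of cleanup that means (the user, host and data directory are placeholders):

# Illustrative only: grant the application user privileges (user/host are placeholders)
# and clear the module library cache so OpenMRS rebuilds it on the next startup.
mysql -u root -p -e "GRANT ALL PRIVILEGES ON openmrs.* TO 'openmrs-user'@'%'; FLUSH PRIVILEGES;"
rm -rf /path/to/openmrs-data/.openmrs-lib-cache       # placeholder data directory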

Regarding your question, “Our clinic is fairly small, and I don’t think it is realistic for us to manage our configuration using the files in the initializer module.”

I would suggest that having the initializer module is still beneficial, but the number of change requests should be minimized or managed well. With the initializer module, even a small change requires running the entire deployment process, so keeping changes minimal reduces overhead. This way your EMR can be replicated and is easier to maintain. But it is your choice.


That sounds really nice! Is it something you’re able to share?

If migrating to Ozone HIS is a possibility (while opting for Bahmni (EMR) as your EMR app in the Ozone setup), then you could leverage these kinds of goodies immediately: Backup & Restore - Docs

The scripts are not anything too advanced and are probably somewhat specific to our setup, but I can post the basic structure here. I’m not an expert at bash scripting. On the Bahmni machine there is a backup script called backup_master which looks like this:

#!/bin/bash

email="..."
backup_folder="..."

append_output() {
    echo "$1"
    output=$(printf "%s\n%s" "$output" "$1");
}

error_email() {
    datestring=`date +%Y-%m-%d`
    timestring=`date +%H:%M`
    printf "%s" "$output" | {
        echo "An error occurred during backup of $(hostname) on $datestring at $timestring."
        echo ""
        echo "The output generated was:"
        cat -
    } | sed 's/\r//' | mail -s "Backup Error Notification for $(hostname)" "$email"
}

success_email() {
    datestring=`date +%Y-%m-%d`
    timestring=`date +%H:%M`
    printf "%s" "$output" | {
        echo "Backup succeeded for $(hostname) on $datestring at $timestring."
        echo ""
        echo "The output generated was:"
        cat -
    } | sed 's/\r//' | mail -s "Backup Success Notification for $(hostname)" "$email"
}

backup_log_failed() {
    # Code that was aided by chatgpt. This takes $output, indents using a tab character,
    # and then appends it to the echo statement below. Then it writes it to backup.log
    printf "%s\n" "$output" | sed 's/^/\t/' | {
        echo "$backupdate $backuptime - backup failed. The output generated was:"
        cat -
    } >> "$backup_folder/backup.log"
}

backup_log_succeeded() {
    echo "$backupdate $backuptime - backup completed at $(date +%H:%M:%S)" >> "$backup_folder/backup.log"
}

output=""
backupdate=$(date +%Y-%m-%d)
backuptime=$(date +%H:%M:%S)

append_output "Backup $backupdate - `hostname`"
append_output "Backup started at $backuptime"

backup1_out=$("$backup_folder/backup1.sh" all)
backup1_exit=$?
append_output "$backup1_out"
if [ $backup1_exit -ne 0 ]; then
    backup_log_failed
    error_email
    exit 1
else
    append_output "Connecting to backup server at $(date +%H:%M:%S)"
    backup2_out=$("$backup_folder/backup2.sh")
    backup2_exit=$?
    append_output "$backup2_out"
    if [ $backup2_exit -ne 0 ]; then
        backup_log_failed
        error_email
        exit 1
    fi
fi

append_output "Connecting to synology server at $(date +%H:%M:%S)"
backup3_out=$("$backup_folder/backup3.sh")
backup3_exit=$?
append_output "$backup3_out"
if [ $backup3_exit -ne 0 ]; then
    backup_log_failed
    error_email
    exit 1
fi

append_output "Backup finished at $(date +%H:%M:%S)"
backup_log_succeeded
# success_email

echo "$output" > "$backup_folder/backup_last.log"

The backup is done in three steps. backup1.sh generates the dump files of the databases (shutting down and starting up Bahmni). backup2.sh connects to a remote backup server and transfers the backup files. backup3.sh uses rsync to replicate the patient documents (in /home/bahmni) onto another backup server location.

Each of those scripts prints output about what is going on, and its exit code indicates whether it succeeded. The master script checks the exit code of each script; if one fails, it dumps the output to the backup log and also sends it by email. If everything succeeds, it just records the start and end time of the backup in the log. In either case, it writes the full output of the backup to the file backup_last.log.
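
So a sub-script just has to print what it is doing and exit non-zero on failure. A simplified, illustrative shape (not the real backup3.sh; the host and paths are placeholders):

#!/bin/bash
# Illustrative only: print progress to stdout and let a non-zero exit status
# signal failure to backup_master. Host and paths are placeholders.
set -e                                    # abort with a non-zero status if any command fails
echo "Starting rsync of patient documents at $(date +%H:%M:%S)"
rsync -a --delete /home/bahmni/ backup-server:/backups/bahmni-home/
echo "rsync finished at $(date +%H:%M:%S)"
exit 0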


The restore scripts are similar in that they write a single line to the log if the restore succeeds, and otherwise write the full output to the log and email it. But the process is a bit more complicated because the backup to restore from has to be selected. I will post some of it here.

Here are the restore steps in the master restore script (this is of course not ready to run as is):

output=""
restoredate=$(date +%Y-%m-%d)
restoretime=$(date +%H:%M:%S)

append_output "Restore $restoredate - `hostname`"
append_output "Restore started at $restoretime"

cmd_out=$(stop_bahmni)
cmd_res=$?
append_output "$cmd_out"
if [ $cmd_res -ne 0 ]; then
    restore_log_failed
    error_email
    exit 1
fi

append_output "Restoring Bahmni Databases"
cmd_out=$(mount_backups)
cmd_res=$?
append_output "$cmd_out"
if [ $cmd_res -ne 0 ]; then
    unmount_backups
    restore_log_failed
    error_email
    exit 1
fi

cmd_out=$(select_backup "$DBBACKUP_MOUNT_PATH")
cmd_res=$?
if [ $cmd_res -ne 0 ]; then
    append_output "$cmd_out"
    unmount_backups
    restore_log_failed
    error_email
    exit 1
fi
bdate=$cmd_out
append_output "Selected backup folder: \033[34m$bdate\033[0m"

# Restore databases
start_db_restore=$SECONDS

cmd_out=$(yes | ./restore_databases.sh -d "$DBBACKUP_MOUNT_PATH"/"$bdate")
cmd_res=$?
append_output "$cmd_out"
if [ $cmd_res -ne 0 ]; then
    unmount_backups
    restore_log_failed
    error_email
    exit 1
fi
end_db_restore=$SECONDS
restoretime2=$((end_db_restore - start_db_restore))

append_output "DB Restore took $restoretime2 seconds"

cmd_out=$(unmount_backups)
cmd_res=$?
append_output "$cmd_out"
if [ $cmd_res -ne 0 ]; then
    restore_log_failed
    error_email
    exit 1
fi

append_output "Restoring Patient Files"
cmd_out=$(./restore_files.sh)
cmd_res=$?
append_output "$cmd_out"
if [ $cmd_res -ne 0 ]; then
    restore_log_failed
    error_email
    exit 1
fi

cmd_out=$(restart_bahmni)
cmd_res=$?
append_output "$cmd_out"
if [ $cmd_res -ne 0 ]; then
    restore_log_failed
    error_email
    exit 1
fi

append_output "Restore finished at $(date +%H:%M:%S)"
restore_log_succeeded

echo "$output" > "$RESTORE_FOLDER/restore_last.log"

The code to select a backup folder to restore from is this:

#!/bin/bash

# Extract date from folder name
extract_date() {
    if [[ "$1" =~ (^|[^0-9])([0-9]{8})([^0-9]|$) ]]; then
        date_str="${BASH_REMATCH[2]}"
        if date -d "$date_str" "+%Y%m%d" >/dev/null 2>&1; then
            echo "$date_str"
        fi
    fi
}

get_folder_dates() {
    local dir="$1"
    if [ ! -d "$dir" ]; then
        echo "Directory not found: $dir" >&2
        return 1
    fi
    # Get list of folders
    while IFS= read -r -d '' path; do
        folder="$(basename "$path")"
        date_str=$(extract_date "$folder")
        if [[ -n "$date_str" ]]; then
            folder_dates["$folder"]="$date_str"
        fi
    done < <(find "$dir" -mindepth 1 -maxdepth 1 -type d -print0) # pipe find results into read
}

select_backup() {
    # Usage info
    if [ $# -lt 1 ]; then
        echo "Find the most recent backup date in a folder, or find the backup date matching a specific date"
        echo "Usage: $0 <directory> [specific_date]"
        return 1
    fi

    directory="$1"
    specific_date="$2"
    declare -A folder_dates # Associative array

    get_folder_dates "$directory"

    # exit if no valid folders
    if [ ${#folder_dates[@]} -eq 0 ]; then
        echo "No valid backup folder"
        return 1
    fi

    if [ -n "$specific_date" ]; then
        for folder in "${!folder_dates[@]}"; do
            if [ "${folder_dates[$folder]}" == "$specific_date" ]; then
                echo "$folder"
                return 0
            fi
        done
        return 1
    fi

    # Find folder with max date
    latest_folder=""
    latest_date="00000000"
    for folder in "${!folder_dates[@]}"; do
        current_date="${folder_dates[$folder]}"
        if [[ "$current_date" > "$latest_date" ]]; then
            latest_date="$current_date"
            latest_folder="$folder"
        fi
    done

    if [ -n "$latest_folder" ]; then
        echo "$latest_folder"
        return 0
    else
        return 1
    fi
}

I’m not sure how useful this is without taking the time to understand it and put it together! It could make sense to try to make the process more robust and generic, but I’m not sure how easy that would be. Maybe I will do that sometime and post the whole thing to GitHub. The solution posted by @mksd would be an easy option.


@deepakneupane Thanks a lot, it’s very helpful to see the notes about the upgrade process. We saw the issue about the 00:00:00 dates, and I was hoping that updating the Java tz data (using tzupdater) and setting the TZ variable to a timezone without DST would fix this, but I’m not sure if it really has or not. I will look into the other points. It’s also helpful to have the input about whether to use the initializer module configuration.

I’m confused. I’ve had a look at https://openmrs.atlassian.net/wiki/spaces/docs/pages/25477009/Ozone+-+Health+Information+System and Ozone HIS seems to do the same as what Bahmni does (e.g., all the functions including ERP for stock control etc.). If you’re already using Bahmni, why would you migrate to Ozone? Genuinely curious! Apologies if this is seen as hijacking the thread. I can start a new one if that’s preferred?

Did you read this? :backhand_index_pointing_right: Understanding the Differences Between Bahmni and Ozone HIS

I have now, yes, that’s cleared things up.