Please feel free to comment on and review my progress and this presentation.
Coming together is a beginning, keeping together is progress, working together is success.
Short description: The amount of data generated grows every day, and so does the appetite for extracting information from it. Transactional databases cannot satisfy this growing appetite for data analysis. This project aims to build an ETL module that interacts with multiple data warehouse (DW) systems, on top of which predictive modeling code can run, so that healthcare providers can review predictive modeling results based on the historical data they hold or load.
Thank you for your comments and suggestions on my ongoing project.
Allow raw SQL as one method for collecting data for transforms, since some of our queries for ETL are complex.
Yes, I considered it, and I will add that option. I had been avoiding it because it did not look safe: it may be a security concern, since an attacker could easily inject their own query.
As you’ve started, allow for a variety of targets to be added over time (e.g., HIVE, MySQL, file output, SOLR, etc.)
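One way to keep the list of targets open-ended is to hide each destination behind a small interface and look it up by name. The following is only a minimal sketch under that assumption; `EtlTarget`, `CsvTarget`, and `TargetRegistry` are hypothetical names, not part of the module or of OpenMRS, and the CSV rendering does no quoting or escaping.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical names throughout; none of these come from the module or from OpenMRS.
interface EtlTarget {
    // Write one batch of transformed rows to the destination.
    void write(List<Map<String, Object>> rows);
}

// File-style target: renders each row as a comma-separated line.
// Lines are kept in memory here to keep the sketch self-contained.
class CsvTarget implements EtlTarget {
    final List<String> lines = new ArrayList<>();

    @Override
    public void write(List<Map<String, Object>> rows) {
        for (Map<String, Object> row : rows) {
            List<String> cells = new ArrayList<>();
            for (Object value : row.values()) {
                cells.add(String.valueOf(value)); // no quoting or escaping in this sketch
            }
            lines.add(String.join(",", cells));
        }
    }
}

// Maps a target name (e.g. "csv", "mysql", "hive") to its implementation.
class TargetRegistry {
    private final Map<String, EtlTarget> targets = new LinkedHashMap<>();

    void register(String name, EtlTarget target) {
        targets.put(name, target);
    }

    EtlTarget lookup(String name) {
        EtlTarget target = targets.get(name);
        if (target == null) {
            throw new IllegalArgumentException("Unknown target: " + name);
        }
        return target;
    }
}
```

With this shape, adding Hive, MySQL, or Solr later would mean writing one more `EtlTarget` implementation and registering it, without touching the transform code.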
Yes, I considered issues such as late-night scheduling and multiple target outputs. Maximum memory or CPU usage did not come to mind because I am dealing with small data, but I will consider it now. I will implement these soon.
This can be addressed by requiring an additional privilege for creating/editing raw SQL transforms that could be given only to people who are trusted or already have full access to the system (e.g., administrators, developers, etc.).
I mean, for example, that users who have the “MySQL ETL Raw SQL” privilege will see the option to add or edit existing transforms that are raw SQL. Users with privileges to use the module but without this “Raw SQL” privilege may be able to see raw SQL transforms and schedule or run them, but they would not be able to edit or create them.
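The rule described above can be sketched as two checks over a user's set of privilege names. This is only an illustration: the class `RawSqlAccess` and the base privilege name "MySQL ETL" are assumptions made for the example, and only "MySQL ETL Raw SQL" comes from the comment. Inside OpenMRS the same test would presumably go through `Context.hasPrivilege(...)` rather than a plain `Set`.

```java
import java.util.Set;

// Hypothetical helper; the privilege sets stand in for the logged-in user's privileges.
class RawSqlAccess {
    // Assumed name for the general "may use the module" privilege.
    static final String USE_MODULE = "MySQL ETL";
    // Privilege name taken from the suggestion above.
    static final String RAW_SQL = "MySQL ETL Raw SQL";

    // Viewing, scheduling, and running raw SQL transforms needs only module access.
    static boolean canViewOrRun(Set<String> privileges) {
        return privileges.contains(USE_MODULE);
    }

    // Creating or editing raw SQL transforms additionally needs the Raw SQL privilege.
    static boolean canCreateOrEdit(Set<String> privileges) {
        return privileges.contains(USE_MODULE) && privileges.contains(RAW_SQL);
    }
}
```

The form for adding or editing a raw SQL transform would then be shown only when `canCreateOrEdit` returns true, and the same check should be repeated on the server side when the form is submitted.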
I would suggest creating an OpenMRS Task, i.e., extending AbstractTask as seen in these examples. Your task can have additional properties about the source and destination, and your module can even provide a simple form to create and schedule these tasks, but admins will also be able to see and administer them through the existing task scheduler (admin/Admin123) within OpenMRS. You could put credentials into your TaskDefinition by using a password field when collecting the password and using reversible encryption (rather than a one-way hash) to obfuscate the credentials stored in the scheduler_task_config_property table.
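Since the stored credentials have to be recovered when the task runs, what is needed is strictly encryption rather than hashing. Below is a minimal sketch using the JDK's AES-GCM support. `CredentialCodec` is a hypothetical name, and where the key comes from is left open here; OpenMRS may already offer a helper for this, so treat this as an illustration of the idea, not as the platform's API.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Base64;
import javax.crypto.Cipher;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical helper for storing a password in a task property in recoverable form.
class CredentialCodec {
    private static final int IV_BYTES = 12;   // recommended nonce size for GCM
    private static final int TAG_BITS = 128;  // authentication tag length

    // Returns Base64(iv || ciphertext); a fresh random IV is used on every call.
    static String encrypt(String plain, SecretKey key) {
        try {
            byte[] iv = new byte[IV_BYTES];
            new SecureRandom().nextBytes(iv);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
            byte[] ct = cipher.doFinal(plain.getBytes(StandardCharsets.UTF_8));
            byte[] out = new byte[iv.length + ct.length];
            System.arraycopy(iv, 0, out, 0, iv.length);
            System.arraycopy(ct, 0, out, iv.length, ct.length);
            return Base64.getEncoder().encodeToString(out);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("Could not encrypt credential", e);
        }
    }

    // Reverses encrypt(): splits off the IV, then decrypts and verifies the rest.
    static String decrypt(String stored, SecretKey key) {
        try {
            byte[] in = Base64.getDecoder().decode(stored);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, key,
                    new GCMParameterSpec(TAG_BITS, in, 0, IV_BYTES));
            byte[] plain = cipher.doFinal(in, IV_BYTES, in.length - IV_BYTES);
            return new String(plain, StandardCharsets.UTF_8);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("Could not decrypt credential", e);
        }
    }
}
```

The task would call `encrypt` before saving the password property and `decrypt` inside its `execute()` method just before opening the connection. The protection is only as good as the handling of the key, which should not live in the same table as the encrypted value.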