Connect to Databases

Importing Source Data from Relational Databases

Connecting & refreshing from relational databases

Omniscope includes a Database Connection dialogue to help establish direct connections to database tables and reporting views. The dialogue is accessed via the Main Toolbar: File > Connect to database command. Using the dialogue, anyone can import the results of any SQL statement/query that returns a table.

In order to connect directly to a database (rather than importing a delimited .CSV or .TSV data file that has been exported from a database) you need to know the type of database and its connection details. From the Database Connection dialogue, enter the connection details for your database and choose which database table or reporting view you wish to retrieve data from. By default, this will create a persistent linked data source relationship between this database table/reporting view and the .IOK (Omniscope) files you are creating.

If you connect via ODBC, you can access multiple transactional tables, but only one database. If you connect via JDBC, you can retrieve data from multiple tables in multiple databases, and performance will also be better. If you do not know how to write the SQL query that assembles the reporting view you want to import into Omniscope, ask your Database Administrator to add the query in Omniscope. Every subsequent refresh re-creates the reporting view, drawing the latest information from transactional table(s) within the database(s) to create a single, 'flattened' tabular view of the data for scrubbing, analysis and reporting.

Connecting via ODBC

ODBC is a technology for connecting to most databases and requires you to set up a named ODBC data source in Windows. This is the approach needed to connect to an Access .MDB database, for example.  In Windows XP, for example, go to Control Panel, Administrative tools and choose Data Sources. Click Add, choose the Access driver (or otherwise, depending on your type of database), choose the appropriate .MDB file and enter a suitable Data Source Name.

Connecting via JDBC

If you have a database that supports JDBC, or wish to customise the database import behaviour, you can connect via JDBC using a JDBC JAR driver file obtained from your database vendor or 3rd party. This may need to be specific to your database version as well as vendor. You will need to know how to construct the "JDBC URL" for your database. If your database is covered in the examples, follow the instructions on the example page for your database. If your database is not covered in the example list, see the JDBC generic instructions.

The Database Connection dialogue sequence of screens allows you to choose JDBC and enter the JDBC JAR driver path such as:
"C:\Program Files\MyDatabase\Java\JDBC.jar", and the JDBC URL such as "jdbc:mydb:products".
Enter the username and password, if necessary, and choose the table or view you want to look at.
On the next sceen, you can customise the SQL statements, allowing you to JOIN multiple tables on the fly. Only change these values if you are familiar with the implementation of SQL for your database vendor and version.

 

Relational Databases as linked sources

Any Omniscope file serves as a template file for future data refreshed versions of the file. Omniscope template files remember and refresh from their data sources. If the source is a relational database, Omniscope will re-execute the embedded SQL statement and any subsequent merges, joins and aggregations on every refresh.

Saving the connection and configuring refresh

To save the connection details you have entered into the Database Connection dialogue for subsequent refreshes, save the current file using File > Save. The saved file will contain a (highly-compressed) snapshot of the data as it was when imported, plus all the database connection parameters, including the SQL statement that returns the table you entered into the dialogue. You can configure refresh behavior each time you (as owner) save the file. You can test refreshing from the database specified in the Database Connection dialogue by choosing Data > Refresh from source.

Setting Field (column) data types

Unlike text-based delimited data file import (such as XLS or CSV), Omniscope when the linked source is a direct connection to a relational database does not automatically inspect and convert data types. Instead, the data type descriptions of the fields in your database are relied upon to determine data type in Omniscope. If your database declares the field "CUST_AGE" as Text ("Char", "String" or "Varchar"), Omniscope will treat it as Text also. To avoid the need for manual changes, you must structure your source database correctly by declaring the field "CUST_AGE" as a numeric field (for example).

Editing data from databases

At present, relational database connections in Omniscope only go one way, i.e. they only import a copy of the data in the database table or reporting view. You can make edits but cannot submit your changes directly back to the database table automatically. If you plan to edit/scrub data from databases in Omniscope, you will need to agree a way of submitting documented corrections and additions back to the Database Administrator, usually by exporting a .CSV or a human-readable, parse-able .XML corrections file such that the Administrator can use simple tools to import the corrections made in Omniscope back to the source database.


Suggestions: Don't make too many pre-selections in the SQL statement query assembling the reporting view. Errors in fields can result in some records not being included in the reporting view if you are being selective. Try just dumping all records for each major entity (e.g. people and all fields linked to people, places and all fields linked to places, things and all attrributes linked to things etc.) into a few large reporting views extracted from the database(s). Open these views in Omniscope and save them on the server - effectively creating 'data marts' in Omniscope file format. You can then re-name/correct/scrub and filter/hide unwanted data visually and create 'child' Omniscope files which will refresh from the (server-based) 'data mart' Omniscope files, which are in turn automatically refreshing from the data warehouse.

Warning: Preserving changed and merged data: Omniscope does not currently support partial refresh from database tables/views. Refreshing from linked data sources will overwrite all the data in a given Omniscope file with data from the linked source table only. If you have changed data in the Omniscope file manually, or merged/pasted columns of data from other sources, such as spreadsheets or departmental databases, your changes will be lost on file refresh from the linked source. If you are using Omniscope to merge data from multiple sources, keep your merged data and commentary in separate .IOK, .CSV or .XLS 'merge files' to avoid those values being lost every time the main data file is refreshed from a linked source. You can prevent accidental refreshes from source by not ticking the option to maintain the link to source, but this will also remove your database connection parameters from the Omniscope file.