
Scaling & Performance

Working with ever-larger data sets

The power of Omniscope on the desktop is due in part to its unique client-side in-memory architecture. All the data in the file is held in the memory of the local machine, at maximum granularity and with aggregated transforms, simultaneously. As files grow in terms of row and column counts, the RAM required on the machine running Omniscope increases. As computers gain ever more RAM and processing power at ever lower prices, Omniscope-based solutions become increasingly compelling regardless of the size of the data sets. However, some scaling and performance considerations should be borne in mind when deploying Omniscope in situations where very large data sets or central IOK datamarts will be used.

Scaling Omniscope solutions

Omniscope is a locally-installed in-memory application with no inherent data volume limits in terms of record or cell counts (cells = rows x columns). Given sufficient memory-addressing capacity in your operating system (64-bit systems can address far more than 32-bit systems), plus sufficient RAM and fast processors, Omniscope files can extend to multi-million-record data sets on 64-bit systems running 64-bit Java with 4+ GB of RAM. We currently have clients running 34 million row data sets on commodity mail-order computers with 64-bit Windows and 16 GB of RAM. However, because reporting/publishing chains may also include less powerful 32-bit machines on 'downstream' user desktops, solutions involving very large data sets may require various optimisation techniques and tools to minimise the effects of the 32-bit memory-addressing bottleneck for files delivered to recipients' desktops.
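As a rough illustration of the cell-count arithmetic above, the sketch below estimates whether a data set of a given size is likely to fit within the memory available to a 64-bit Java process. It is not Omniscope code: the 50-bytes-per-cell figure and the 80% headroom threshold are illustrative assumptions for this example only, and the actual per-cell cost varies with data types and content.

// A minimal sizing sketch, not part of Omniscope: estimates whether a data set of
// a given row and column count is likely to fit in the current JVM heap.
public class MemorySizingSketch {
    // Hypothetical average cost per cell; real cost depends on data types and content.
    private static final long ASSUMED_BYTES_PER_CELL = 50;

    public static void main(String[] args) {
        long rows = 34_000_000L;   // e.g. the 34 million row example above
        long columns = 20;
        long cells = rows * columns;                       // cells = rows x columns
        long estimatedBytes = cells * ASSUMED_BYTES_PER_CELL;

        long maxHeap = Runtime.getRuntime().maxMemory();   // heap ceiling of this JVM (-Xmx)
        System.out.printf("Estimated footprint: %.1f GB, JVM heap ceiling: %.1f GB%n",
                estimatedBytes / 1e9, maxHeap / 1e9);
        if (estimatedBytes > maxHeap * 0.8) {
            System.out.println("Likely too large for this JVM: aggregate, time-slice or move to a 64-bit server.");
        }
    }
}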

In general, highly-scalable Omniscope-based solutions should be organised as a 'waterfall', in which the most RAM-intensive tasks are performed on 64-bit servers or on the desktops of a few power users with abundant RAM, so that the resulting 'downstream' report files are less RAM-intensive and perform well on the typical 32-bit desktops of downstream users. In a fully-automated, Enterprise Server-based 'waterfall' implementation, a set of Omniscope files is maintained as single-table (non-normalised) 'wholesale datamarts', refreshing directly from data warehouses/cubes running on the same powerful 64-bit servers with abundant RAM. Power users at the head of the reporting/workflow chain can usually use aggregated or time-sliced (and therefore smaller) versions of these Omniscope files, which have the wholesale 'datamart' files (not the data warehouse SQL reporting views) as their linked source. These desktop report files will be smaller, and can refresh automatically from the larger 'datamart' files kept on the server. The power users can then re-configure these smaller template files to generate/refresh still smaller, more specialised reporting files for onward distribution to typical desktops and free Viewers. The smallest reporting subsets or 'dashboards' can be configured and exported as near-universally accessible Flash SWF DataPlayers for interactive web page display and embedding in documents by anyone in the chain with an activated Omniscope (see below for information on the expanding data capacity limits of DataPlayers).
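To make the 'aggregated or time-sliced' step concrete, here is a minimal, hypothetical sketch (not Omniscope code or any Omniscope API) of collapsing hourly records into daily totals: the kind of upstream aggregation that cuts downstream row counts by roughly a factor of 24. The HourlyRow shape and field names are invented for the example.

import java.time.LocalDate;
import java.time.LocalDateTime;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: an upstream "power user" step that collapses hourly records
// into daily totals before publishing a smaller downstream report file.
public class DailyAggregationSketch {
    record HourlyRow(LocalDateTime timestamp, String campaign, double spend) {}

    public static Map<LocalDate, Double> aggregateToDaily(List<HourlyRow> hourly) {
        Map<LocalDate, Double> daily = new TreeMap<>();
        for (HourlyRow row : hourly) {
            daily.merge(row.timestamp().toLocalDate(), row.spend(), Double::sum);
        }
        return daily; // roughly 1/24th of the original row count
    }

    public static void main(String[] args) {
        List<HourlyRow> hourly = List.of(
                new HourlyRow(LocalDateTime.of(2012, 5, 1, 9, 0), "Spring", 120.0),
                new HourlyRow(LocalDateTime.of(2012, 5, 1, 17, 0), "Spring", 80.0),
                new HourlyRow(LocalDateTime.of(2012, 5, 2, 9, 0), "Spring", 95.0));
        System.out.println(aggregateToDaily(hourly)); // {2012-05-01=200.0, 2012-05-02=95.0}
    }
}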

The sections below discuss options and issues related to Omniscope scaling and performance in more detail:

Topics:

[Legacy] Scaling DataPlayers (Flash .SWF files)

Stand-alone DataPlayers are a near-universally accessible file export option from any Omniscope data file. Due to inherent limitations in Flash, .SWF DataPlayers will always have a much lower data capacity, in terms of record or cell counts (cells = rows x columns), than a highly-scalable, locally-installed data analysis, management and reporting solution like Omniscope. In general, DataPlayers can contain at least 20,000 records (rows), but various factors specific to the data can affect performance and impose lower limits on record count. We are re-writing the Flash-generating code in ActionScript 3, and expect both scaling and performance improvements that should raise the limit above 500,000 records in a single .SWF file.
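The following sketch is a simple pre-flight check, not part of Omniscope or its export options: it applies the ~20,000-record guideline above to a candidate subset before a .SWF DataPlayer export is attempted. The class and method names are hypothetical.

// A hedged pre-flight check, not an Omniscope API: flags subsets that exceed the
// current ~20,000-record guideline for .SWF DataPlayers.
public class DataPlayerSizeCheck {
    private static final long CURRENT_RECORD_GUIDELINE = 20_000;  // from the guidance above

    public static boolean withinGuideline(long rows, long columns) {
        long cells = rows * columns;  // cells = rows x columns
        System.out.printf("Subset: %d rows x %d columns = %d cells%n", rows, columns, cells);
        return rows <= CURRENT_RECORD_GUIDELINE;
    }

    public static void main(String[] args) {
        if (!withinGuideline(150_000, 12)) {
            System.out.println("Aggregate or filter this subset before exporting a .SWF DataPlayer.");
        }
    }
}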

Smaller data sets, aggregations (e.g. daily data rather than hourly) or defined subsets (Named Queries) taken from larger Omniscope .IOK files can be converted to DataPlayer 'dashboards' automatically, either on a routine schedule or on demand, using personalised or 'permissioned' data sets delivered to the Generator from back-end repositories and/or analytical staging databases.

Topics:

Managing .SWF DataPlayer scaling & performance