OpenI Plug-in for i2b2


The OpenI plug-in provides a better UI for querying and downloading patient data from i2b2.

i2b2 (Informatics for Integrating Biology and the Bedside) is an NIH-funded, scalable informatics framework. It enables researchers to use existing clinical data for discovery research and has wide international adoption. It consolidates various clinical and biological (including genomic) data and allows mapping to standard metadata. Once the clinical research data sources aggregated into the i2b2 framework are mapped to standard “ontologies”, i2b2 provides a web-based query interface for obtaining counts of patients matching various medical or biological conditions defined using standard terminology.

The OpenI plug-in integrates its business intelligence dashboard and reporting features with the i2b2 data set. Instead of issuing queries one by one to find potential patient cohorts for a study, users get a "top-down" view of the data, which they can slice and dice visually, and they can perform ad hoc analysis without having to write their own queries.

Technical Requirements

The OpenI plug-in runs on top of Pentaho, which is built on Java (J2EE), so it can run on any OS that supports a JVM, such as Windows or Linux. The plug-in does not need a separate server of its own, only the server hosting the Pentaho instance. OpenI provides installation services that cover installing and configuring both Pentaho and the plug-in.

The OpenI plug-in requires building OLAP cubes from the i2b2 data in order to support the ad hoc, drag-and-drop analysis you saw in the demo. Depending on your data volume, the cubes can be built either on the same machine or on a separate server.
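To make the idea of a cube more concrete, here is a toy, in-memory sketch in Python. It is not how Pentaho builds or stores its cubes; the sample rows, the dimension names (sex, age_band, diagnosis), and the helper functions are all hypothetical. It only illustrates the principle: distinct-patient counts are precomputed for every combination of dimensions, so that slicing by any subset of them becomes a lookup rather than a new query.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical, already-flattened fact rows: one row per patient observation.
FACTS = [
    {"patient": 1, "sex": "F", "age_band": "40-59", "diagnosis": "diabetes"},
    {"patient": 1, "sex": "F", "age_band": "40-59", "diagnosis": "hypertension"},
    {"patient": 2, "sex": "M", "age_band": "40-59", "diagnosis": "diabetes"},
    {"patient": 3, "sex": "F", "age_band": "60+", "diagnosis": "diabetes"},
]
DIMENSIONS = ("sex", "age_band", "diagnosis")


def build_cube(facts, dimensions):
    """Precompute the set of distinct patients for every combination of dimensions."""
    cube = defaultdict(set)
    for size in range(len(dimensions) + 1):
        for dims in combinations(dimensions, size):
            for row in facts:
                key = (dims, tuple(row[d] for d in dims))
                cube[key].add(row["patient"])
    return cube


def patient_count(cube, **filters):
    """Return the distinct patient count for the given dimension=value filters."""
    # Keep the dimensions in their canonical order so the key matches build_cube.
    dims = tuple(d for d in DIMENSIONS if d in filters)
    values = tuple(filters[d] for d in dims)
    return len(cube.get((dims, values), set()))


cube = build_cube(FACTS, DIMENSIONS)
print(patient_count(cube))                                 # all patients: 3
print(patient_count(cube, diagnosis="diabetes"))           # 3
print(patient_count(cube, sex="F", diagnosis="diabetes"))  # 2
```

The number of precomputed combinations grows quickly with the number of dimensions and distinct values, which is one reason larger data volumes may call for a separate server.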

The ETL job in OpenI is written as a Kettle job (Kettle is the ETL tool within Pentaho). It needs to be configured according to the concepts you currently have in your Concept dimension (e.g. demographics, diagnoses, labs, medications) so that it pulls the data out of i2b2 and lays it out in a dimensional data model, on which we then define and process the OLAP cubes.
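The shape of such an ETL step can be sketched in a few lines of Python with SQLite. This is an illustration only, not the actual Kettle job. The source tables are heavily simplified stand-ins for the i2b2 tables patient_dimension, concept_dimension, and observation_fact (the real tables have many more columns). The target tables (dim_patient, dim_concept, fact_observation), the age bands, the sample concept codes, and the rule of taking the first segment of the concept path as the category are all assumptions made for the example.

```python
import sqlite3


def make_sample_i2b2(conn):
    """Create simplified stand-ins for three i2b2 tables, with a few sample rows."""
    conn.executescript("""
        CREATE TABLE patient_dimension (patient_num INTEGER, sex_cd TEXT, age_in_years_num INTEGER);
        CREATE TABLE concept_dimension (concept_path TEXT, concept_cd TEXT, name_char TEXT);
        CREATE TABLE observation_fact (patient_num INTEGER, concept_cd TEXT, start_date TEXT);
    """)
    conn.executemany("INSERT INTO patient_dimension VALUES (?, ?, ?)",
                     [(1, "F", 52), (2, "M", 47)])
    conn.executemany("INSERT INTO concept_dimension VALUES (?, ?, ?)", [
        ("\\Diagnoses\\Diabetes\\", "DX:250", "Diabetes mellitus"),
        ("\\Labs\\HbA1c\\", "LAB:HBA1C", "Hemoglobin A1c"),
    ])
    conn.executemany("INSERT INTO observation_fact VALUES (?, ?, ?)", [
        (1, "DX:250", "2012-01-05"),
        (1, "LAB:HBA1C", "2012-01-05"),
        (2, "DX:250", "2012-02-11"),
    ])


def run_etl(src, dst):
    """Extract from the i2b2-like source and load a small dimensional model."""
    dst.executescript("""
        CREATE TABLE dim_patient (patient_key INTEGER PRIMARY KEY, sex TEXT, age_band TEXT);
        CREATE TABLE dim_concept (concept_key INTEGER PRIMARY KEY, concept_cd TEXT,
                                  name TEXT, category TEXT);
        CREATE TABLE fact_observation (patient_key INTEGER, concept_key INTEGER, obs_date TEXT);
    """)
    # Patient dimension: derive a coarse age band (hypothetical banding).
    for num, sex, age in src.execute(
            "SELECT patient_num, sex_cd, age_in_years_num FROM patient_dimension"):
        band = "60+" if age >= 60 else "40-59" if age >= 40 else "0-39"
        dst.execute("INSERT INTO dim_patient VALUES (?, ?, ?)", (num, sex, band))
    # Concept dimension: use the first path segment as the category.
    concept_keys = {}
    for path, code, name in src.execute(
            "SELECT concept_path, concept_cd, name_char FROM concept_dimension"):
        category = path.strip("\\").split("\\")[0]
        cur = dst.execute(
            "INSERT INTO dim_concept (concept_cd, name, category) VALUES (?, ?, ?)",
            (code, name, category))
        concept_keys[code] = cur.lastrowid
    # Fact table: replace the concept code with the surrogate key.
    for num, code, date in src.execute(
            "SELECT patient_num, concept_cd, start_date FROM observation_fact"):
        dst.execute("INSERT INTO fact_observation VALUES (?, ?, ?)",
                    (num, concept_keys[code], date))
    dst.commit()


src = sqlite3.connect(":memory:")
dst = sqlite3.connect(":memory:")
make_sample_i2b2(src)
run_etl(src, dst)

# A cube-style question: distinct patients per concept category.
rows = dst.execute("""
    SELECT c.category, COUNT(DISTINCT f.patient_key)
    FROM fact_observation f JOIN dim_concept c ON c.concept_key = f.concept_key
    GROUP BY c.category ORDER BY c.category
""").fetchall()
print(rows)  # [('Diagnoses', 2), ('Labs', 1)]
```

Each additional concept category in the source would need its own mapping decisions of this kind, which is why the job has to be configured against the concepts actually present in your i2b2 instance.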

Additionally, we assume that there will be one server machine on which to install Pentaho with the OpenI plug-in. We also need a relational database to house the dimensional data schema used by the OLAP cubes. This database can live on the same server machine where Pentaho and the OpenI plug-in are installed; if you already have a managed relational database server, we recommend using that instance, since it should be easier for your DBA to manage.

So, architecturally speaking, we need a server-class machine (usually starting at a quad processor with 8 to 16 GB of memory) with adequate disk space (we prefer fast disks or a SAN) to house the data coming from i2b2 and build the OLAP cubes. This machine will connect to your i2b2 database to pull data out periodically, and will serve an HTTP interface through which your researchers access the dashboards and reports. The machine will have the following key software components:
  1. OS - Linux or Windows
  2. J2EE server - JBoss or similar
  3. Pentaho server
  4. OpenI plug-in
  5. RDBMS (preferably a column-based database such as LucidDB, Greenplum, or Infobright among open source options, or Vertica or VectorWise among commercial ones)

Comparing OpenI Plug-in for i2b2 to other BI and Reporting Tools

Criteria | OpenI | Tableau | Other Dashboard Tools (QlikView, SAP Crystal Dashboard, RoamBI, etc.) | Open Source BI Platforms (Pentaho, JasperReports) | Proprietary BI Platforms (IBM Cognos, Oracle Hyperion, SAP Business Objects, MicroStrategy)
--- | --- | --- | --- | --- | ---
Has pre-built ETL integrated with the i2b2 schema | Yes | No | No | No |
Has pre-built dashboards and interactive reports built off i2b2 data | Yes | No | No | No |
Has support and integration staff with i2b2 data experience | Yes | No | No | No |
Is open source | Yes | No | No | Yes |
Supports both Linux and Microsoft OS | Yes | Microsoft only | Yes | Yes | Yes
Supports any JDBC-compliant database | Yes | Only Microsoft SQL Server | Yes | Yes |
Easy to integrate and customize | Yes | No | No | Yes |
Cost | $ | $$ | $$$ | $$ | $$$$$

Development Notes