Electronic medical records (EMRs), now used in nearly every hospital and clinic in the United States, are mostly just “dumb” raw text, which makes the data they contain difficult to query the way a database can be queried. Add the fact that there are competing systems and no single standard format, and the data becomes difficult to share between clinics and researchers. From a computing perspective, ingesting medical records into a database automatically, with little human intervention, is extraordinarily complex. Optical character recognition is only a small part of the puzzle.
Take, for example, a single lab report. To process it, a medical record ingestion system needs to do the following:
- Identify the lab test type
- Extract the test line items and data points, each with its unit of measure
- Capture metadata such as the lab name, while ignoring the lab director’s name
- Capture the physician’s name
- Ignore accepted ranges and other unnecessary data
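To make the list above concrete, here is a minimal sketch in Python of what extracting those fields from one already-digitized report might look like. Everything in it is an assumption made for illustration: the sample report text, the field labels ("Lab Director:", "Ordering Physician:", "Test:"), the rule that the first line is the lab name, and the names `LabReport`, `LabResult`, and `parse_report`. It is not MST's engine, and a handful of regular expressions like this would only work for a single, known layout.

```python
import re
from dataclasses import dataclass, field

# Hypothetical report text, standing in for the output of OCR or PDF extraction.
SAMPLE_REPORT = """\
Acme Clinical Laboratory
Lab Director: Dr. Jane Roe
Ordering Physician: Dr. John Smith
Test: Basic Metabolic Panel
Glucose 98 mg/dL (70-99)
Sodium 140 mmol/L (135-145)
Potassium 4.1 mmol/L (3.5-5.1)
"""


@dataclass
class LabResult:
    name: str
    value: float
    unit: str


@dataclass
class LabReport:
    lab_name: str = ""
    test_type: str = ""
    physician: str = ""
    results: list = field(default_factory=list)


# A line item is a name, a numeric value, a unit, and an optional reference
# range in parentheses. The range is matched but deliberately not captured.
LINE_ITEM = re.compile(
    r"^(?P<name>[A-Za-z][A-Za-z ]*?)\s+"
    r"(?P<value>\d+(?:\.\d+)?)\s+"
    r"(?P<unit>\S+)"
    r"(?:\s+\(.*\))?$"
)


def parse_report(text: str) -> LabReport:
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    report = LabReport()
    if lines:
        # Assumption: the first non-empty line is the lab name.
        report.lab_name = lines[0]
    for line in lines[1:]:
        if line.startswith("Lab Director:"):
            continue  # metadata we were told to ignore
        if line.startswith("Ordering Physician:"):
            report.physician = line.split(":", 1)[1].strip()
        elif line.startswith("Test:"):
            report.test_type = line.split(":", 1)[1].strip()
        else:
            match = LINE_ITEM.match(line)
            if match:
                report.results.append(
                    LabResult(match["name"], float(match["value"]), match["unit"])
                )
    return report


if __name__ == "__main__":
    print(parse_report(SAMPLE_REPORT))
```

Note how much of the logic depends on exact labels and line order. A report from a different lab would likely break every one of these rules.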
Of course, all of this would be relatively trivial if these forms were standardized across systems. Unfortunately, that’s not the case: lab reports, patient data, and related forms come in seemingly endless variations.
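One small piece of that variation is that different systems label the same field differently. The sketch below, again in Python, maps a few label variants onto canonical field names. The alias table and the function name `canonical_field` are hypothetical and chosen only for illustration. Real reports vary in layout and wording far beyond what a lookup table can cover, which is the point of the paragraph above.

```python
from typing import Optional

# Hypothetical alias table: each canonical field name maps to label variants
# that different systems might print for it.
FIELD_ALIASES = {
    "physician": ["ordering physician", "ordered by", "provider", "referring md"],
    "test_type": ["test", "test name", "panel", "procedure"],
    "lab_name": ["laboratory", "performing lab", "lab"],
}

# Invert the table once so each lookup is a single dictionary access.
ALIAS_TO_FIELD = {
    alias: canonical
    for canonical, aliases in FIELD_ALIASES.items()
    for alias in aliases
}


def canonical_field(label: str) -> Optional[str]:
    """Return the canonical field name for a printed label, or None if unknown."""
    # Normalize case, a trailing colon, and runs of whitespace before lookup.
    key = " ".join(label.strip().rstrip(":").lower().split())
    return ALIAS_TO_FIELD.get(key)


if __name__ == "__main__":
    for label in ["Ordered By:", "  PANEL ", "Lab Director"]:
        print(repr(label), "->", canonical_field(label))
```

Unknown labels return `None` rather than a guess, so a caller can route them to human review instead of silently misfiling data.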
To help hospitals and clinics manage EMR data from disparate sources, we’ve developed a state-of-the-art solution for ingesting medical records quickly, accurately, and painlessly. MST’s natural language processing engine understands and extracts the valuable data contained in PDFs generated by EMR systems.