
Dr. Kum is cross-trained in computer science (PhD in data mining) and social work (Master of Social Work in the macro track, focusing on policy, management, and community organizing rather than clinical social work), with over 15 years of experience using big data about people (e.g., government administrative data and EHR) to support timely, evidence-based decisions in research, policy analysis, evaluation, and clinical care. As one of the few data scientists cross-trained in both the social and health domain sciences and computer science, she focuses her research on how to use the abundance of existing digital data, aka big data about people. She is an expert in (1) record linkage and privacy, (2) sequential pattern mining, and (3) secure data infrastructure (e.g., online open data portals, computer security, IRB, data governance) for handling person-level data. She currently serves on the Texas state IRB, the TAMU IRB, and the Big Data Committee at Texas A&M University. She founded and currently leads the Population Informatics Lab, which brings together computer scientists, statisticians, social scientists, health services researchers, and ELSI researchers to answer critical questions in the SBEH (social, behavioral, economic, and health) sciences using preexisting big data about people, together with methods that support the ethical use of such data. Population informatics applies data science to social genome data (digital traces of person-level data) to answer fundamental questions about human society, much as bioinformatics applies data science to human genome data to answer questions about individual health.

Experience

  • 2020–present
    Professor, Texas A&M University
  • 2013–2020
    Associate professor, Texas A&M University
  • 2012–2014
    Research associate professor, University of North Carolina at Chapel Hill
  • 2004–2012
    Research assistant professor, University of North Carolina at Chapel Hill

Education

  • 2004 
    University of North Carolina at Chapel Hill, PhD Computer Science (Data Mining)
  • 1998 
    University of North Carolina at Chapel Hill, MSW Social Work (Policy & Management)
  • 1997 
    University of North Carolina at Chapel Hill, MS Computer Science

Publications

  • 2018
    Balancing Privacy and Information Disclosure in Interactive Record Linkage with Visual Masking, Proceedings of the SIGCHI Conference on Human Factors in Computing Systems.
  • 2017
    Post-acute care for children with special health care needs, Disability and Health Journal. Sep 2017.
  • 2015
    Using big data for evidence based governance in child welfare, Children and Youth Services Review (2015), Volume 58, November 2015, Pages 127-136, ISSN 0190-7409, doi: 10.1016/j.childyouth.2015.09.014.
  • 2014
    Former foster youth: Employment outcomes up to age 30, Children and Youth Services Review, 2014. 36(0): pp. 220-229.
  • 2014
    Privacy preserving interactive record linkage (PPIRL), J Am Med Inform Assoc, 2014;21:212–220. PMCID: PMC3932473
  • 2014
    Population Informatics: Tapping the Social Genome to Advance Society: A Vision for Putting Big Data to Work for Population Informatics, IEEE Computer Special Outlook Issue. Jan 2014. pp. 56-63.
  • 2013
    Privacy-by-Design: Understanding Data Access Models for Secondary Data, AMIA Summits Transl Sci Proc. 2013: p. 126-30.
  • 2009
    Supporting Self-Evaluation in Local Government via KDD, Government Information Quarterly: Building the Next-Generation Digital Government Infrastructures, 26(2): pp. 295-304, April 2009, Elsevier.
  • 2007
    Benchmarking the Effectiveness of Sequential Pattern Mining Methods, Data & Knowledge Engineering (DKE), 2007. 60(1): pp. 30-50.
  • 2003
    ApproxMAP: Approximate mining of consensus sequential patterns, Proceedings of the Third SIAM International Conference on Data Mining.

Grants and Contracts

  • 2019
    Evaluation of the 1115 Medicaid Waiver Demonstration in Texas
    Role: PI
    Funding Source: Texas Health and Human Services Commission (TX-HHSC)
  • 2017
    Privacy Preserving Interactive Record Linkage (PPIRL) via Information Suppression
    Role: PI
    Funding Source: Patient-Centered Outcomes Research Institute
  • 2017
    Collaborative Research: A Benchmark Data Linkage Repository (DLRep)
    Role: Site PI
    Funding Source: National Science Foundation
  • 2017
    Predicting Suicide-Related Outcomes Using Sequential Pattern Mining
    Role: PI
    Funding Source: U.S. Department of Veterans Affairs
  • 2016
    Centers for Agricultural Safety and Health (U54)
    Role: Site PI
    Funding Source: The National Institute for Occupational Safety and Health (NIOSH)
  • 2007
    Creating Indicators and Improving Outcomes: Analytic Assistance for Child Welfare, Work First, Food and Nutrition Services, and Employment and Training and Career Start in NC
    Role: co-PI
    Funding Source: NC Department of Health and Human Services

Professional Memberships

  • ACM
  • IEEE
  • AMIA (FAMIA)

Honours

Texas A&M Presidential Impact Fellow; FAMIA; Royster Fellow