Health care data comes in many different forms. The data is collected, organized, and analyzed to improve patient outcomes and health care processes. Data mining is a technique that researchers use to look for hidden patterns and relationships in large amounts of data. In this discussion, you will evaluate a large data set compiled by the U.S. Department of Health & Human Services. This data set lists breaches of protected health information affecting 500 or more individuals. Use the data set to address the following:
- How many records are in this file?
- What is the name of the covered entity in your state that had the highest number of individuals affected (be sure to identify your state and the name of the institution)?
- What type of breach occurred?
- Describe the type of violation.
- How could this violation have been avoided?
- How would you apply the data mining methods of clustering and association rule mining to this file?
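
As a starting point for the last question, here is a minimal sketch of association rule mining on categorical breach attributes. It only illustrates the idea of support and confidence for pairwise rules. The column names (`entity_type`, `breach_type`, `location`), the five toy rows, and the thresholds are illustrative assumptions, not values taken from the actual HHS file. Clustering is not shown here.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy rows: column names and values are illustrative
# assumptions, NOT records from the actual HHS data set.
rows = [
    {"entity_type": "Healthcare Provider", "breach_type": "Hacking/IT Incident", "location": "Network Server"},
    {"entity_type": "Healthcare Provider", "breach_type": "Theft", "location": "Laptop"},
    {"entity_type": "Health Plan", "breach_type": "Hacking/IT Incident", "location": "Network Server"},
    {"entity_type": "Healthcare Provider", "breach_type": "Hacking/IT Incident", "location": "Email"},
    {"entity_type": "Business Associate", "breach_type": "Theft", "location": "Laptop"},
]


def to_items(row):
    """Turn one record into a set of 'column=value' items (a transaction)."""
    return frozenset(f"{key}={value}" for key, value in row.items())


def pair_rules(rows, min_support=0.3, min_confidence=0.7):
    """Return rules A -> B between single items that meet both thresholds.

    support(A, B)  = share of records containing both A and B
    confidence     = support(A, B) / support(A)
    """
    n = len(rows)
    transactions = [to_items(r) for r in rows]
    item_counts = Counter(item for t in transactions for item in t)
    pair_counts = Counter(
        frozenset(pair) for t in transactions for pair in combinations(sorted(t), 2)
    )

    rules = []
    for pair, count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue  # the pair is too rare to be interesting
        for antecedent in pair:
            (consequent,) = pair - {antecedent}
            confidence = count / item_counts[antecedent]
            if confidence >= min_confidence:
                rules.append((antecedent, consequent, support, confidence))
    return sorted(rules)


rules = pair_rules(rows)
for antecedent, consequent, support, confidence in rules:
    print(f"{antecedent} -> {consequent} (support={support:.2f}, confidence={confidence:.2f})")
```

On real data, the same approach would start by loading the file's rows and choosing which categorical columns to treat as items. The thresholds would likely need tuning, since a file with many records tends to produce many low-support pairs.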