Analytics, statistics, data science, artificial intelligence. What does all this mean? What are we doing in data science in healthcare, specifically in revenue cycle management concerning financial data for healthcare?
First, what is data science? It is a way of grouping a whole range of things: big data, computer science (which is effectively writing software), algorithms, applications, statistics, analytics, and many others. It’s an all-encompassing umbrella term that was developed because, until relatively recently, there wasn’t a cohesive, unifying way to refer to all of them together. This article was initially available on our podcast.
What is Analytics in Data Science in Healthcare?
Analytics is a sub-discipline of data science in healthcare. It’s the area that we’re most focused on, although I’ll come back to that in a second. What’s the difference between analytics and statistics? One definition I’ve heard is that analytics is all about finding good questions while statistics is about finding good answers. You’ve heard us talk a lot about coming up with questions and making sure the questions are well structured so that you can ultimately get good answers. But I’m not sure I fully agree with this definition, although I understand what they’re saying.
By one definition, analytics is concerned with what’s in your data, while statistics goes beyond your data. There’s certainly something to this: statistics, for example, may tell you what to believe about an entire population based on a sample, because you don’t have data on the whole population.
An example
Think about an upcoming election. You want to know how people are going to vote. You’re not going to poll 100 million people and get answers from all of them. That would be all of the data, and we could then analyze that data. Or you could take a small subset of that and project from that into the larger population using statistical methods.
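To make the polling idea concrete, here is a minimal Python sketch. Everything in it is made up for illustration: the electorate is simulated, the 52% support figure is an assumption, and `estimate_share` is a hypothetical helper that uses the standard normal approximation for a proportion. It is not how any real pollster works, only the "count everything" versus "project from a sample" contrast described above.

```python
import math
import random


def estimate_share(sample, z=1.96):
    """Return (estimated share, margin of error) for a list of 0/1 responses.

    Uses the normal approximation for a proportion; z=1.96 corresponds
    to a roughly 95% confidence level.
    """
    n = len(sample)
    p_hat = sum(sample) / n
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, margin


# Simulated electorate (an assumption for this sketch):
# 1 = votes for candidate A, 0 = votes for someone else.
rng = random.Random(42)
population = [1 if rng.random() < 0.52 else 0 for _ in range(100_000)]

# "Analytics" view: we hold all of the data, so we simply count.
true_share = sum(population) / len(population)

# "Statistics" view: poll 1,000 people and project to everyone.
poll = rng.sample(population, 1_000)
p_hat, margin = estimate_share(poll)

print(f"share from all the data: {true_share:.3f}")
print(f"share from the poll:     {p_hat:.3f} +/- {margin:.3f}")
```

The poll touches only 1% of the simulated voters, yet the margin of error stays within a few percentage points. That is the sense in which statistics goes "beyond your data."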
I get that definition. There’s certainly something to it. In my opinion, we at Apache Health are primarily concerned with analytics, although we also do some statistics, some machine learning, and some other things as well. We write software, so that means computer science and more.
What is AI in Data Science in Healthcare?
What’s artificial intelligence in data science in healthcare, aside from being a movie and the subject of many apocalyptic stories like the Terminator’s Skynet? Artificial intelligence is anything where a machine emulates a human process that we would carry out intellectually, with our minds, rather than just physically. So it’s not a robot. Artificial intelligence means doing something “intelligent.”
Google’s search engine is considered a form of artificial intelligence (or AI). I’m not knocking Google’s search algorithm; there’s something to it. It is worth many billions of dollars, perhaps even close to a trillion dollars. So it is undoubtedly valuable. But it’s not what most people would think of as excellent, sexy AI.
What about machine learning?
Machine learning is a branch of artificial intelligence that uses data and algorithms to allow a system, whether one computer or a series of computers, to learn and improve over time automatically (in other words, on its own). Effectively, we put some structure and guardrails in place and give the system data to train on. Then it applies what it has learned to different data to draw conclusions.
Hopefully, there’s a feedback loop built into the programming so that the system learns from how well it does. It runs one set of calculations or one type of analysis and says, “Ah, I get this type of result, and it fits this well. If that works, great, I will keep that information. But then I’m going to test something else. And if that methodology produces a better prediction than the first one, I’m going to go with that one instead.”
The system is learning
All that should happen automatically. Ideally, the system continuously goes through that process of figuring out what works better than something else on its own. That way, a human isn’t selecting the statistical methodology, such as linear regression; the system is doing that independently. That’s machine learning.
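The "try one approach, test another, keep whichever predicts better" loop can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data is synthetic, the two candidate methods (a plain average and a one-variable least-squares line) were picked for brevity, and `fit_mean`, `fit_line`, and `pick_best` are hypothetical helper names. A real machine learning system compares far more candidates and keeps re-running this loop as new data arrives.

```python
import random


def fit_mean(xs, ys):
    """Baseline candidate: always predict the average training target."""
    avg = sum(ys) / len(ys)
    return lambda x: avg


def fit_line(xs, ys):
    """Candidate: ordinary least squares line with a single input."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    return lambda x: intercept + slope * x


def mean_squared_error(model, xs, ys):
    """Average squared gap between predictions and actual values."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def pick_best(candidates, train, holdout):
    """Fit each candidate on train, score it on holdout, keep the lowest error."""
    scores = {}
    for name, fit in candidates.items():
        model = fit(*train)
        scores[name] = mean_squared_error(model, *holdout)
    best = min(scores, key=scores.get)
    return best, scores


# Synthetic data (made up for this sketch): y is roughly 3x + 10 plus noise.
rng = random.Random(0)
xs = [rng.uniform(0, 100) for _ in range(200)]
ys = [3 * x + 10 + rng.gauss(0, 5) for x in xs]

# Train on the first 150 points; judge on 50 points the models never saw.
train = (xs[:150], ys[:150])
holdout = (xs[150:], ys[150:])

best, scores = pick_best({"mean": fit_mean, "line": fit_line}, train, holdout)
print("selected method:", best)
print("holdout errors:", scores)
```

The key design choice is that the code, not a person, decides which method wins, and it decides based on data the models were not trained on. Adding another candidate only means adding another entry to the dictionary.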
The most common languages for machine learning are Python and R. There are also things like deep learning and neural networks, but that’s really a topic for another day. I will leave you with this quote from Mat Velloso on what separates artificial intelligence from machine learning: “If it’s written in Python, it’s probably machine learning. If it’s written in PowerPoint, it’s probably artificial intelligence.”
I think that’s hilarious. I think it’s a brilliant quote because we see the term “artificial intelligence” thrown around so much. Many of these claims are BS, and that captures it so perfectly because it makes fun of all the people running around and claiming they have artificial intelligence.
To summarize
We see this a lot in our industry. There’s a reasonably large number of claims that some artificial intelligence is involved. Increasingly, people are even claiming that there’s some machine learning. I think these claims frequently fall somewhere between aspirational and, more cynically, marketing, or perhaps worse, people outright fabricating what they’re doing. That’s our take on data science, statistics, analytics, artificial intelligence, machine learning, and how all of those disciplines and sub-disciplines relate.