API security has often been a blind spot for enterprises. In fact, it’s very common to see unauthenticated APIs. In most cases, this happens because authentication and authorization protections for the APIs are overlooked during development. Sometimes, APIs are left without protections so that they can later be integrated with authorization controllers in API gateways, which adds another opportunity for misconfiguration. As incidents involving mHealth apps, Panera Bread, Fiserv, LifeLock, Kay Jewelers, and several others show, API security remains a crucial factor.
To dive deeper into the subject, CISO MAG interviewed Sanjay Nagaraj, CTO and Co-Founder of Traceable. Nagaraj is an entrepreneur and a Silicon Valley engineering leader. He believes in building products and teams that are obsessed with customer success.
Nagaraj discusses his entrepreneurial journey, API security and the potential risks associated with it, and the future of AI and ML in cybersecurity.
You have had quite a remarkable journey. From AppDynamics, to then staying in stealth mode with Traceable for a significant part of the time, to coming out with a $20 million investment. I think Traceable was born a unicorn. Can you summarize your entrepreneurial journey? Where did it all begin?
I believe in building products and teams that are obsessed with customer success. Prior to co-founding Traceable, I was VP Engineering for AppDynamics/Cisco. At AppDynamics, I was responsible for product teams for Application Performance Management and Database Monitoring products. Additionally, I was responsible for scaling teams across different geographic locations. The innovation that my team and I built was critical in helping DevOps teams to lead the digital transformation at many of the Fortune 100 companies. With the customer obsession of my team, and the products at AppDynamics, I was responsible for generating over half a billion dollars in revenue during my tenure. As a senior engineering leader, I have been building complex enterprise software solutions for over 20 years. Prior to AppDynamics, I worked at various companies including Hyperion Solutions (Oracle) and Philips. I am an inventor credited with several U.S. Patents.
According to major industry analysts, IT organizations struggle to evolve their processes for developing, delivering and managing APIs for integration and digital business transformation. Do you feel API security has become a blind spot for several businesses? How instrumental can a CISO be in this regard?
Yes, absolutely. Many organizations don’t have security practitioners as part of their developer cycle, and developers are not being trained on how to secure their APIs. APIs come in many forms:
- APIs that you expose to your clients, which can be applications (mobile or single-page apps).
- APIs used by developers, whether they are B2B customers or internal dev teams.
- Third-party or B2B APIs that you send requests to and expect responses from.
CISOs can be instrumental in putting the focus on their API security by asking some simple questions: “Do we know where our APIs are? How many APIs are internal vs. external? What types of users/roles access our APIs? Where is the sensitive data accessed?”
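The questions above amount to keeping an API inventory. As a minimal sketch of what answering them could look like (the `Endpoint` record, its field names, and the sample entries are hypothetical illustrations, not part of any Traceable product):

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Endpoint:
    # Hypothetical inventory record; the field names are assumptions.
    path: str
    external: bool              # reachable from outside the organization?
    roles: frozenset = field(default_factory=frozenset)  # roles allowed to call it
    sensitive: bool = False     # does it read or return sensitive data?


def summarize(endpoints):
    """Answer the inventory questions for a list of Endpoint records."""
    exposure = Counter("external" if e.external else "internal" for e in endpoints)
    roles = set().union(*(e.roles for e in endpoints))
    return {
        "total": len(endpoints),
        "internal": exposure["internal"],
        "external": exposure["external"],
        "roles": sorted(roles),
        "sensitive_paths": sorted(e.path for e in endpoints if e.sensitive),
        # Endpoints with no role listed deserve a review for missing access control.
        "no_roles": sorted(e.path for e in endpoints if not e.roles),
    }


# Made-up sample data for illustration only.
inventory = [
    Endpoint("/v1/login", external=True, roles=frozenset({"anonymous"})),
    Endpoint("/v1/orders", external=True, roles=frozenset({"customer"}), sensitive=True),
    Endpoint("/internal/metrics", external=False),
]
report = summarize(inventory)
```

In practice the inventory would have to be discovered from live traffic or gateway configuration rather than typed in by hand, since the point of the questions is that many organizations do not know the answers.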
To get better at API security, development teams need to understand what we call their application DNA, understand how it is changing, and be able to identify anomalies to detect and block legacy and new threats. These new threats require an understanding of the application context and user behavior to really distinguish the bad actors from the regular users. Can you elaborate on “application DNA?” What is that and why is it important for application teams to understand?
Sure. Application DNA is what we call the collection of data that defines what an application is made up of, how those parts interact, how each of those parts behaves, and how the different users of the application interact with each of those parts. Knowing this data is vitally important to be able to effectively secure modern applications. But in these modern applications, this data is continuously changing, so it is not only challenging to collect this data but even harder to keep it updated. This is why we created Traceable, to use distributed tracing and AI to make it possible for teams to create and maintain secure applications in these challenging environments.
Cloud-native applications have clearly become hackers’ favorite targets. These applications are all API-driven, with APIs exposing business logic to the outside world. Do you think the current application security approaches are built for modern application architectures?
With the explosion of cloud-native apps and services, we now have an explosion of services and microservices, all talking to each other using different APIs. This has also drastically grown the number of unique APIs that are being used. The number of clients has grown exponentially as well, between mobile, IoT, and other services calling each other, and data has become the new gold. So, the data, this precious thing, we are constantly handling it, manipulating it, and passing it off to other services which might or might not be safe. In general, there’s now a lot more to keep track of, and the interactions between everything are more varied and more complex. This is serious. Today’s app architectures have added a whole new attack surface at the API level.
One of the fundamental problems is that people don’t even know what APIs are being used and which APIs have a potential security risk, or which APIs could be used by attackers in bad ways. How can we get better visibility?
Better visibility is a must to be able to secure today’s cloud-native apps. To accomplish this, I believe the industry needs to shift to what we call Security Observability. Security Observability is the combination of service relationships, API DNA, data flow and risk, and user behavior analytics. Together, these give the visibility that is so critical for securing today’s apps.
Traceable extensively leverages AI and ML. But it can also be safely said that AI and ML are still evolving in several functionalities. Historically, one of the biggest difficulties for AI has been to distinguish between legitimate users and malicious ones. How does Traceable solve this problem differently?
One primary challenge for us when we were developing Traceable AI was developing algorithms to efficiently analyze massive amounts of data to trace user activities and detect anomalies and potential threats (traditional tools often focus on IP addresses – Traceable works with user identity). Our data science team developed advanced methods and algorithms to extract the accurate identity of the user from the data stream. They also developed unsupervised AI models to detect changes and anomalies, which are often indicators of malicious activity or attacks. As a result, our AI algorithm can continually learn and determine the difference between nominal and abnormal activity.
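To make the contrast between IP-based and identity-based detection concrete, here is a deliberately simple sketch of label-free anomaly detection keyed on user identity. It uses a per-user z-score on request counts, which is an assumption chosen for illustration and far simpler than the models described above; the function names, the threshold, and the sample data are all hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev


def build_baseline(events):
    """events: iterable of (user_id, request_count) pairs, one per time window."""
    per_user = defaultdict(list)
    for user_id, count in events:
        per_user[user_id].append(count)
    # Store the mean and population standard deviation for each identity.
    return {user: (mean(counts), pstdev(counts)) for user, counts in per_user.items()}


def is_anomalous(baseline, user_id, count, threshold=3.0):
    """Flag a window whose count deviates strongly from that user's own history."""
    if user_id not in baseline:
        return True  # no history for this identity, so treat it as suspicious
    mu, sigma = baseline[user_id]
    if sigma == 0:
        return count != mu
    return abs(count - mu) / sigma > threshold


# Made-up history: requests per window for one user.
history = [("alice", 10), ("alice", 12), ("alice", 11), ("alice", 9), ("alice", 10)]
baseline = build_baseline(history)
```

Because the baseline is keyed on the user rather than the IP address, the same person is tracked consistently across devices and networks, which is the distinction the answer above draws.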
More specifically, AI has been adopted heavily and successfully where image processing or natural language processing (NLP) can be applied. This mostly applies to domains where the data on which models are built are mostly static and deterministic and large amounts of supervised data exist. The challenge in cybersecurity is that every environment is different. Similar to applications and APIs evolving, hackers are continuously evolving as well. So, fixed rules or static models don’t work. Instead, the predominant strategy used is anomaly detection, which is a strategy to discover the proverbial needle in the haystack. However, anomaly detection has been plagued with issues of false positives, primarily due to the algorithms lacking context. Traceable addressed this by building the Traceable platform that helps gather as much application context as possible.
The second issue plaguing solutions that use anomaly detection for cybersecurity is poor correlation capabilities. It is very hard to track the attacker’s path within a gamut of time-varying data of high state-space complexity. Traceable has taken a unique approach leveraging graph learning algorithms, breaking down the problem in a way that constrains the state-space and thereby makes these algorithms viable. I think the future of AI in cybersecurity is in leveraging graphs and understanding users within an application context. Graphs are hard to work with, but by constraining the state-space they become a viable solution to an otherwise complex problem. I think this fundamental change in thinking is what is needed to address the cybersecurity issues of the future. Using graph machine learning to track users and their data and how data flows through the system is the key.
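One way to picture a constrained state-space is a transition graph built over normalized endpoints instead of raw URLs. The sketch below is only an illustration of that idea, not graph machine learning and not Traceable's method: it collapses numeric IDs into a single node and flags call-to-call transitions that never appeared in baseline sessions. The function names, the normalization rule, and the sample paths are assumptions.

```python
import re
from collections import defaultdict

# A path segment made only of digits is treated as an ID (an assumption).
_ID_SEGMENT = re.compile(r"/\d+(?=/|$)")


def normalize(path):
    """Collapse numeric IDs so /orders/17 and /orders/42 map to the same node."""
    return _ID_SEGMENT.sub("/{id}", path)


def build_graph(sessions):
    """sessions: list of per-user call sequences, each a list of request paths."""
    graph = defaultdict(set)
    for calls in sessions:
        nodes = [normalize(p) for p in calls]
        for src, dst in zip(nodes, nodes[1:]):
            graph[src].add(dst)
    return graph


def unseen_transitions(graph, calls):
    """Return the transitions in one session that the baseline graph never saw."""
    nodes = [normalize(p) for p in calls]
    return [(s, d) for s, d in zip(nodes, nodes[1:]) if d not in graph.get(s, set())]


# Made-up baseline sessions and one suspicious session.
baseline_sessions = [
    ["/login", "/orders", "/orders/17"],
    ["/login", "/orders", "/orders/42", "/logout"],
]
graph = build_graph(baseline_sessions)
suspicious = unseen_transitions(graph, ["/login", "/orders/99", "/admin/export"])
```

Normalizing IDs keeps the number of nodes proportional to the number of endpoint templates rather than the number of distinct URLs, which is what keeps the graph small enough to reason about.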
Cybersecurity and legal teams have often been operating in silos, but this needs to change. Often, in-house legal teams handle some of their company’s most sensitive and confidential data, and law firms face an even more daunting security challenge, having to manage the highly confidential and privileged data of all their clients. How can you eliminate these silos?
Even more than other companies that collect sensitive data from their customers, law firms and legal departments need to make sure they are protecting their clients’ data. The API vulnerabilities that lead to data breaches in all companies are equally effective against software used by legal organizations. The difference is that in certain instances the legally sensitive data might be considered a more valuable target. But there is another part to this, which is the importance of being able to prove that the sensitive data is not being leaked. Auditing sensitive data flow was challenging even before so many applications became cloud-native. Now that this sensitive data can be handled by tens or hundreds of microservices distributed around the globe, tracking its flow is even more difficult. Staying in compliance with sensitive data requirements now requires holistic visibility of how the sensitive data flows across the entire distributed application landscape.
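As a rough sketch of what auditing sensitive data flow across microservices could involve, the example below groups trace spans by request and lists which services touched sensitive fields. The span layout, the field names in `SENSITIVE_FIELDS`, and the service names are all hypothetical; a real deployment would take its spans from a distributed tracing system rather than hand-written dictionaries.

```python
from collections import defaultdict

# Hypothetical list of field names considered sensitive.
SENSITIVE_FIELDS = {"ssn", "case_notes"}


def sensitive_flow(spans):
    """spans: iterable of dicts with 'trace_id', 'service', and 'fields' keys,
    assumed to be ordered by time within each trace."""
    flows = defaultdict(list)
    for span in spans:
        touched = SENSITIVE_FIELDS & set(span["fields"])
        if touched:
            flows[span["trace_id"]].append((span["service"], sorted(touched)))
    return dict(flows)


# Made-up spans for two requests.
spans = [
    {"trace_id": "t1", "service": "gateway", "fields": ["ssn", "name"]},
    {"trace_id": "t1", "service": "billing", "fields": ["amount"]},
    {"trace_id": "t1", "service": "archive", "fields": ["ssn", "case_notes"]},
    {"trace_id": "t2", "service": "search", "fields": ["query"]},
]
audit = sensitive_flow(spans)
```

The resulting per-request record is the kind of evidence an auditor could use to show where sensitive data went, and equally where it did not go.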
Where is the future of AI and ML in cybersecurity headed?
In general, regarding AI and ML’s role in cybersecurity, I think today we still get a lot of eye-rolling when we talk about this, because unfortunately there has been a lot of AI-washing, where companies have latched on to AI and ML as buzzwords but never really delivered on them. The same was true of cloud and cloud-native: there was a time earlier in the cloud hype cycle when this also happened (cloud-washing, cloud as a buzzword, cloud-everything without true customer value, etc.). Eventually, cloud technology and the value it provides became real and clearly defined. The same thing will happen with AI/ML for cybersecurity, and I am proud that our team at Traceable is advancing the science and art to evolve this space towards that clearly defined state.
What is in store for Traceable?
Traceable will continue its mission to make businesses and the software they rely on more resilient, and we will continue to bring our expertise to the table to rethink how modern applications and APIs can be secured, both in production and preproduction. We have a lot of exciting capabilities on the way, which I don’t believe anyone else can do, and I look forward to being able to share them with everyone.
This interview first appeared in the July 2021 issue of CISO MAG.