Being able to troubleshoot serverless functions in the cloud may be one of the most difficult transitions for those coming from more traditional VM-based development/deployments. You can’t log on to the VM or Docker image that is running your Lambda, so you have to have good logging/tracing.
We have been using AWS XRay and CloudWatch primarily for the last 6 months to trace/log our AWS Lambda functions. Overall it’s been a good experience and we have been able to troubleshoot any issues that have cropped up. The only real pain point has been searching for the right CloudWatch logs. Let me explain…
The Problem
Each Lambda execution environment writes to its own CloudWatch log stream, so depending on how many concurrent executions you have, it can be difficult to find the right stream in a log group. You may also deploy a new version of your Lambda, which will create even more streams. Additionally, searching through a big log and finding the right timeframe for a particular log line can be frustrating as well.
Thankfully, AWS has made the first issue easier by allowing you to search the entire log group.
After many troubleshooting sessions I finally got fed up with all the searching and decided to streamline the troubleshooting process further.
Add CloudWatch Links
We often start troubleshooting in XRay because it provides a view of our distributed traces across many AWS services. So I created a TamperMonkey script to add a CloudWatch Logs link for each Lambda execution trace listed in XRay.
TamperMonkey is a browser plugin which allows the end user to add JavaScript to any webpage.
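A TamperMonkey script starts with a metadata block that tells the plugin which pages to run on. Here is a minimal sketch of one. The script name and the @match pattern are my assumptions (the console URL can differ by region or change over time), so adjust them to the address you see in your browser.

```javascript
// ==UserScript==
// @name         XRay CloudWatch Logs links
// @description  Adds a CloudWatch Logs link to each Lambda trace in XRay
// @match        https://console.aws.amazon.com/xray/*
// @grant        none
// @run-at       document-idle
// ==/UserScript==

// The name and @match value above are assumptions, not tested settings.
(function () {
  "use strict";
  // The script body goes here.
})();
```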
Basically I get the timestamp of the trace and the function name for the trace record where the type is “AWS::Lambda”, create the appropriate CloudWatch link and add it to the page. Here is the script:
// Add a "CloudWatch Logs" link next to every AWS::Lambda entry in the XRay trace view.
Array.from(document.getElementsByClassName("group-type")).forEach(function (element) {
  if (element.innerText !== "AWS::Lambda") {
    return;
  }
  // The function name is read from the second child node of the parent element.
  var lambdaName = element.parentNode.childNodes[1].innerText;
  // The trace timestamp is shown in the timeline overview, in parentheses.
  var when = document.getElementsByClassName("timeline-overview")[0].firstChild.childNodes[1].firstChild.childNodes[3].innerText;
  var date = /\((?<date>[\d-]+)\s+(?<hourAndMin>\d\d:\d\d):(?<sec>\d\d)\s+\w*\)/.exec(when);
  if (date === null) {
    return; // timestamp not in the expected format
  }
  var dateText = date.groups.date + "T" + date.groups.hourAndMin;
  var start = dateText + ":" + date.groups.sec + "Z";
  // Default the end of the filter window to the 59th second of the same minute.
  var end = dateText + ":59Z";
  var link = document.createElement("a");
  link.appendChild(document.createTextNode("CloudWatch Logs"));
  link.href = "https://console.aws.amazon.com/cloudwatch/home?region=us-east-1#logEventViewer:group=/aws/lambda/" + lambdaName + ";start=" + start + ";end=" + end;
  link.style.fontWeight = "bold";
  link.target = "_blank";
  element.parentNode.appendChild(link);
});
It’s not the cleanest JavaScript, I know, but it works and I don’t think it’s worth investing more time in. I’m hoping AWS will also see the value here and add their own link in the XRay console. Until then you can install my script and use TamperMonkey.
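One thing to watch: the script hard-codes us-east-1 in the link. If your functions run elsewhere, a small helper along these lines could take the region as a parameter. This is only a sketch. The name buildLogsLink is hypothetical, and it assumes the timestamp text looks like "... (2019-03-01 12:34:56 UTC)", which is what the regular expression in the script implies.

```javascript
// Hypothetical helper, not part of the original script.
// `when` is assumed to contain a timestamp like "(2019-03-01 12:34:56 UTC)".
function buildLogsLink(region, functionName, when) {
  const match = /\((\d{4}-\d\d-\d\d)\s+(\d\d:\d\d):(\d\d)\s+\w*\)/.exec(when);
  if (match === null) {
    return null; // timestamp not in the assumed format
  }
  const [, date, hourAndMin, sec] = match;
  const start = date + "T" + hourAndMin + ":" + sec + "Z";
  // Same default as the script: end at the 59th second of the same minute.
  const end = date + "T" + hourAndMin + ":59Z";
  return "https://console.aws.amazon.com/cloudwatch/home?region=" + region +
    "#logEventViewer:group=/aws/lambda/" + functionName +
    ";start=" + start + ";end=" + end;
}
```

Because it only works on strings, you can try it in the browser console before wiring it into the page.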
Usage
A few notes on using the script.
The first time you load the XRay trace page the links don’t always show up. I suspect the page is still loading and the script runs too fast. Just refresh the page a few times and it will work.
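If the refreshing gets annoying, one possible workaround is to poll until the trace elements exist before adding the links. I have not wired this into the script, so treat it as a sketch. The names waitForElements and addCloudWatchLink are hypothetical, and the interval and attempt limit are arbitrary defaults.

```javascript
// Hypothetical helper: calls find() every intervalMs until it returns a
// non-empty list, then passes that list to onReady. Gives up after maxAttempts.
function waitForElements(find, onReady, intervalMs = 500, maxAttempts = 20) {
  let attempts = 0;
  const timer = setInterval(function () {
    attempts += 1;
    const found = find();
    if (found.length > 0) {
      clearInterval(timer);
      onReady(found);
    } else if (attempts >= maxAttempts) {
      clearInterval(timer); // stop polling; the page may not be a trace view
    }
  }, intervalMs);
  return timer;
}

// In the userscript it might be used like this (addCloudWatchLink would hold
// the body of the forEach callback from the script):
//
// waitForElements(
//   function () { return Array.from(document.getElementsByClassName("group-type")); },
//   function (elements) { elements.forEach(addCloudWatchLink); }
// );
```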
Also, I am defaulting the end of the CloudWatch filter timeframe to the 59th second of the same minute as the trace time. You may need to adjust the end time to include all the logs you are looking for.
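For longer-running functions, one option is to compute the end time as the start time plus a fixed window instead. Here is a minimal sketch. The name endOfWindow and the 5 minute window in the example are hypothetical, and it assumes the start time is in the same format the script builds (for example "2019-03-01T12:34:56Z").

```javascript
// Hypothetical helper: returns the timestamp windowSeconds after start,
// in the same "YYYY-MM-DDTHH:MM:SSZ" format the script uses.
function endOfWindow(start, windowSeconds) {
  const startMs = Date.parse(start);
  if (Number.isNaN(startMs)) {
    throw new Error("Unparseable start time: " + start);
  }
  // toISOString() includes milliseconds; drop them to match the start format.
  return new Date(startMs + windowSeconds * 1000)
    .toISOString()
    .replace(/\.\d{3}Z$/, "Z");
}

// Example: a 5 minute window after the trace time.
const exampleEnd = endOfWindow("2019-03-01T12:34:56Z", 300);
```

Unlike the fixed 59th second, this also handles traces that start near the end of a minute.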
If this post was helpful or you have any questions/comments please leave a comment below.
Happy Troubleshooting!