Musings on Dockerizing a Perl Application
TL;DR: Containerize your application; it will be fun, they said (or so I tried to tell myself).
I’ve been busy learning about cloud native patterns, Kubernetes, and Docker lately, but haven’t had much keyboard time with the tech. I tend to soak up huge amounts of knowledge before I start playing, while others skip the proverbial RTFM and start hacking away.
Over the weekend, I had the bright idea of modernizing an application I’ve organically built over the last 10 years. I’ll admit it has turned into a bit of a one-eyed green monster, but it has been doing the job and then some. Also, I didn’t have the knowledge/wisdom that I do now, and I haven’t had time to rewrite it in something cool like Python. I picked Perl solely for its strong regex support and because it seemed to be the automation language of choice... back then.
So if I want to move this application towards Kubernetes, the first step is to containerize it. Since I always ran it locally, the list of modules installed using CPAN grew over time, but I never had the equivalent of a requirements.txt or a POM file. So the first step was to standardize the build environment. A bit of research led me to Carton and the use of a cpanfile. Converting my low-grade tracking into a cpanfile was somewhat easy, except that I am human and hadn’t tracked every module, so it took many rounds of adding requirements. On the positive side, installing modules using Carton is much faster than CPAN (even when running with --notest).
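For illustration, a cpanfile along these lines might look like the sketch below. The module list is not the real one: it only includes modules named in this post, plus DBD::mysql as an assumption for the MySQL driver, and versions are deliberately left unpinned.

```perl
# cpanfile (illustrative sketch, not the application's actual dependency list)
requires 'Log::Log4perl';         # logging
requires 'XML::DOM';              # XML parsing
requires 'LWP::Protocol::https';  # HTTPS support for LWP, listed explicitly
requires 'DBD::mysql';            # assumption: the MySQL driver in use
```

With this file in the project root, `carton install` resolves the modules into a local directory and `carton exec -- perl app.pl` runs a script against them (app.pl is a placeholder name).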
A lot of my configuration file paths were hard-coded and specific to Windows. I had to spend some time externalizing and standardizing them. I ended up using an environment variable, CONFIG_DIR, which I can set either locally on Windows using setx or with an ENV statement in the Dockerfile. I have more externalization to do, but it’s a good start.
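A minimal sketch of reading such a variable in Perl, assuming a hypothetical fallback directory (conf) and a hypothetical file name (settings.ini); File::Spec keeps the path separator correct on both Windows and Linux:

```perl
use strict;
use warnings;
use File::Spec;

# Build the path to a configuration file under CONFIG_DIR.
# The 'conf' fallback is a hypothetical default for local runs.
sub config_path {
    my ($file) = @_;
    my $dir = $ENV{CONFIG_DIR} // 'conf';
    return File::Spec->catfile($dir, $file);
}

print config_path('settings.ini'), "\n";
```

The same code then runs unchanged whether CONFIG_DIR comes from setx on Windows or from an ENV line in the Dockerfile.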
Most of my logging went to standard out, which was a good start, but the log output was not as verbose as I wanted. I dug further into Log4perl to improve the log format so it includes the log level, filename, module:function and line number. I also started removing print statements in favor of the logger. I do plan to move the logger configuration to a file so I can change logging settings on the fly.
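As a rough sketch of that kind of format, the snippet below configures Log::Log4perl inline with a PatternLayout. The exact pattern is an assumption rather than the one the application uses: %p is the level, %F{1} the file name, %M the calling function and %L the line number.

```perl
use strict;
use warnings;
use Log::Log4perl;

# Inline configuration for a sketch; the same text could later live in a file.
my $conf = q(
    log4perl.rootLogger = INFO, Screen
    log4perl.appender.Screen = Log::Log4perl::Appender::Screen
    log4perl.appender.Screen.stderr = 0
    log4perl.appender.Screen.layout = Log::Log4perl::Layout::PatternLayout
    log4perl.appender.Screen.layout.ConversionPattern = %d %p %F{1} %M:%L %m%n
);
Log::Log4perl->init(\$conf);

my $log = Log::Log4perl->get_logger();
$log->info('logger configured');   # written to stdout
$log->debug('not shown at INFO');  # suppressed by the INFO threshold
```

Once the configuration sits in a file, Log::Log4perl->init_and_watch($file, $seconds) re-reads it periodically, which is one way to change settings without a restart.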
Selecting a Base Image
My Perl application has been running on Windows for years using Strawberry Perl. Most recently, I was running 5.26 in threaded mode (more on that later). For years, I used the 32-bit version because one of the modules I was using wouldn’t work in 64-bit mode (it was calling a DLL belonging to a GUI application). I have now ripped out the Windows FTP client it integrated with, so 64-bit is all good.
To start out, I opted for ubuntu:latest, where I’d install Perl and my newfound friend Carton. Unfortunately, carton install would fail installing XML::DOM for unknown reasons. I swapped out ubuntu:latest for perl:latest, then sorted out a final round of modules to install and tried to run the container. Turns out, perl:latest is not threaded. Final choice for base image: perl:5.30-threaded!
Looking back, I think some of the package behavior was related to the switch from Windows to a Linux container. Also, some sub-modules I don’t ever recall installing had to be specifically called out on Linux (LWP::Protocol::https).
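A quick way to catch the threading surprise before the application itself falls over is to ask the interpreter how it was built. This is a small sketch using the core Config module; the helper name perl_is_threaded is made up for illustration.

```perl
use strict;
use warnings;
use Config;

# $Config{useithreads} is set only when perl was built with ithreads support.
sub perl_is_threaded {
    return $Config{useithreads} ? 1 : 0;
}

printf "perl %s %s\n", $^V,
    perl_is_threaded() ? 'supports threads' : 'was built without threads';
```

The same information is available from the command line with perl -V:useithreads, which can be run inside any candidate base image.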
The final result is an image that is bigger than I wanted it to be, but I’ll leave that optimization for later, once these battle scars have healed!
REPOSITORY            TAG             IMAGE ID       CREATED       SIZE
patpicos/perltrader   latest          680452c1e805   4 hours ago   1.01GB
perl                  5.30-threaded   b58ca9494993   7 days ago    858MB
Ports: How Do You Get in the Front Door?
Applications living in containers have their own ports, which need to be published to the host machine so that external consumers can interact with them. The default protocol is TCP. My application accepts a firehose of UDP messages (perhaps I should decouple using a topic, but that’s out of scope for now). It turns out that you need to say so explicitly by appending /udp to the port mapping:
docker run -p 8080:3333/udp patpicos/perltrader
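On the container side of that mapping, something has to be listening for UDP on port 3333. The sketch below shows one way that could look with the core IO::Socket::INET module. The helper names and the 65535-byte buffer are assumptions, not the application's real code. Binding to 0.0.0.0 rather than 127.0.0.1 matters, because published traffic arrives on the container's network interface and not on its loopback.

```perl
use strict;
use warnings;
use IO::Socket::INET;

# Open a UDP socket on all interfaces so traffic forwarded by Docker can reach it.
sub open_udp_listener {
    my ($port) = @_;
    my $sock = IO::Socket::INET->new(
        LocalAddr => '0.0.0.0',
        LocalPort => $port,
        Proto     => 'udp',
    ) or die "Cannot bind UDP port $port: $!\n";
    return $sock;
}

# Read one datagram of up to $max_len bytes; blocks until one arrives.
sub read_datagram {
    my ($sock, $max_len) = @_;
    my $data = '';
    defined $sock->recv($data, $max_len) or die "recv failed: $!\n";
    return $data;
}

# Receive forever, handing each message to a callback.
sub run_listener {
    my ($port, $handler) = @_;
    my $sock = open_udp_listener($port);
    while (1) {
        $handler->(read_datagram($sock, 65535));
    }
}
```

Calling run_listener(3333, sub { print "got: $_[0]\n" }) inside the container would then print whatever is sent to UDP port 8080 on the host.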
As I decouple my application, I still have some legacy to deal with while I evolve this dinosaur. The configuration used by the application is written out to the filesystem by a front-end web application every 5 minutes and is continuously re-read by the app. I didn’t want to bake those configuration files into the image, so I had to mount a volume from the local OS into the container.
As it turns out, mounting volumes on Windows is not as easy as it seems. When mounting a volume with the -v localpath:containerpath command line option, Docker would ask to allow sharing and prompt for a username and password. Being a Windows 10 user signed in with an Office 365 account, I tried AzureAD\email_address and my password. Over and over, it would not take it. Many Google searches led me to try:
- Reset Docker to defaults
- Enable SMB v2 (through PowerShell)
- Multiple rounds of retries and frustration
I finally gave up and tried one final recommendation: create a local user and use that for mounting. Lo and behold, progress. However, I still needed to grant NTFS permissions to that user to seal the deal.
Real World Testing
Now that I had made it through the many layers of plumbing and irritation, I had a container image that I could run and point traffic to. I monitored it for an hour and went to bed. I had started the container with --restart on-failure so that it would come back up after a crash, since I know the mysql module has memory leak issues. Lo and behold, the container started restarting every 3–5 minutes overnight. I need to troubleshoot this more. I then moved the container to my Linux server to see whether it was a Docker stability issue on Windows. Overnight, the application kept erroring out, unable to reach the MySQL server; something I never saw when running outside Docker. I’ll know more tomorrow after 8 hours of runtime.
This was quite the ride and I learned a lot in the process:
- Ensure configuration is externalized
- Good logging is a must have
- Track dependencies from early on
- Containerizing is not as easy as it seems
- “It worked in my environment but doesn’t (fully) work in the container” is a distinct possibility
- I need to break this sucker down into more manageable pieces. A re-architecture is in the works