In the work world I’ve lived in, my team at times had to install, manage, upgrade and operationally support various databases and caching tools, but this post is more about my use of these databases as an author of tools rather than about how I handled them operationally. That might come in a future post.
Oracle
When I worked for GE around the end of the last century (whoa, feels like a strange thing to say), they sent me to a week-long Oracle Database Administrator class since I’d been so involved in designing and implementing processes for tracking, applying and reporting on DML/DDL changes. The head of our DBA team even offered to hire me as a Junior DBA at one point, but the idea of starting over as a “Junior” didn’t appeal to me.
Pretty much everywhere I’ve worked uses Oracle alongside a plethora of other databases and data stores.
That training was invaluable in that it gave me an architectural understanding of Oracle: when I see an error about a snapshot being too old because of rollback segment size, I understand what that means. When performance is poor, I understand the Oracle optimizer’s role and the impact of out-of-date stats that no longer match the current table sizes and data.
And at every single job since where Oracle is used, we’ve had this very dialog:
developer/qa/product owner: why is the application so slow in the blurble environment when it's fast in other environments?
me: the size of the dataset in blurble is much smaller/different (e.g. sanitized) than in other environments. Check with the DBA team to make sure they’ve gathered stats recently, that the execution plans are similar to production’s, that all the indexes that exist in prod (or should exist in this environment) are present, and that the Oracle init parameters here are what we expect.
And it’s usually one of these things.
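As a rough sketch of the first kind of check I mean (not code from any of these jobs): this uses the python-oracledb library to list when each table’s stats were last gathered and whether Oracle considers them stale. The connection details and the APP_SCHEMA owner are placeholders.

```python
# Sketch of a stats-staleness check using python-oracledb (pip install oracledb).
# Connection details and the APP_SCHEMA owner are placeholders, not real values.
import oracledb

conn = oracledb.connect(user="me", password="secret", dsn="blurble-db:1521/APPPDB")

with conn.cursor() as cur:
    # DBA_TAB_STATISTICS records when stats were last gathered and whether
    # Oracle considers them stale relative to the data changed since then.
    cur.execute("""
        SELECT table_name, last_analyzed, stale_stats
          FROM dba_tab_statistics
         WHERE owner = :owner
         ORDER BY last_analyzed NULLS FIRST
    """, owner="APP_SCHEMA")
    for table_name, last_analyzed, stale_stats in cur:
        print(f"{table_name:30} last analyzed: {last_analyzed}  stale: {stale_stats}")
```

Tables with NULL or ancient last-analyzed dates, or STALE_STATS = 'YES', are the usual suspects when blurble is mysteriously slow.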
SQLite, MySQL, etc.
I’ve written a lot of dashboards, tools and automation. Earlier on I’d usually use Postgres or MySQL because at the time (I’m really not sure about now, as of this writing) they were free, they pretty much used standard SQL syntax, they were quick, and there were good free tools to interact with them.
SQLite is super easy to use and integrate with Python, so I’ve used it a fair bit, especially for caching or as a lightweight data store. Python has this super useful library, https://pypi.org/project/requests-cache/, that can cache calls made through the Python requests module. I used it in my custom Kubernetes dashboard, KuberDashi, to cache calls to Kubernetes API endpoints (I think I set the cache time at 20 seconds, so if the data was older than that it would re-call the API) and to keep the dashboard from becoming a vector for DoS-ing the API. It was an internal-only tool, but you never know.
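A minimal sketch of that pattern, assuming a SQLite backend; the cache name, endpoint URL and token handling here are illustrative, not KuberDashi’s actual code:

```python
# Minimal sketch of caching requests calls with requests-cache.
# The cache name, URL and bearer token are placeholders.
from requests_cache import CachedSession

# Responses are stored in kube_cache.sqlite and reused for 20 seconds,
# so a busy dashboard doesn't hammer the Kubernetes API on every refresh.
session = CachedSession("kube_cache", backend="sqlite", expire_after=20)

resp = session.get(
    "https://kubernetes.default.svc/api/v1/pods",
    headers={"Authorization": "Bearer <token>"},
)
print("served from cache:", resp.from_cache)
```

The nice part is that callers just use the session like a normal requests session; the SQLite caching is invisible to them.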
I’ve also used the Python SQLAlchemy ORM quite a bit; in KuberDashi, for example, I used it to store information that I queried from our F5 load balancer.
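A minimal sketch of that kind of ORM usage, with a made-up pool-member table standing in for whatever the real F5 data looked like:

```python
# Minimal SQLAlchemy 2.0-style sketch; the PoolMember model and SQLite URL
# are made up for illustration, not KuberDashi's real schema.
from sqlalchemy import String, create_engine, select
from sqlalchemy.orm import DeclarativeBase, Mapped, Session, mapped_column


class Base(DeclarativeBase):
    pass


class PoolMember(Base):
    __tablename__ = "pool_members"

    id: Mapped[int] = mapped_column(primary_key=True)
    pool: Mapped[str] = mapped_column(String(100))
    member: Mapped[str] = mapped_column(String(100))  # e.g. "10.0.0.5:8443"
    state: Mapped[str] = mapped_column(String(20))    # e.g. "up" / "down"


engine = create_engine("sqlite:///f5_snapshot.db")
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add(PoolMember(pool="web-prod", member="10.0.0.5:8443", state="up"))
    session.commit()
    for row in session.scalars(select(PoolMember).where(PoolMember.state == "up")):
        print(row.pool, row.member)
```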
MongoDB
In recent tooling I’ve used MongoDB quite a bit. If your data is, or can easily be, JSON or YAML, it’s blazing fast and easy. For my custom Trivy microservice scanner, which has multiple front ends from a functional point of view and multiple processes feeding data into its datastore, it was perfect. The coolest simple-to-describe thing I did was a text search across all scans of all production microservices for the presence of a specific CVE (https://www.cve.org/).
Suppose a new high-profile vulnerability in a Linux library, or a Java or Node library, etc. was found a week ago; a simple text search (which queries MongoDB) will show any and all microservices that contain that vulnerability.
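A sketch of that kind of search with pymongo; the database, collection and field names are placeholders, and CVE-2021-44228 (Log4Shell) is just a well-known example:

```python
# Sketch of a MongoDB text search with pymongo; the database/collection/field
# names are placeholders, and CVE-2021-44228 is just an example CVE.
from pymongo import MongoClient, TEXT

client = MongoClient("mongodb://localhost:27017")
scans = client["trivy"]["scans"]

# A text index over the raw scan output enables $text queries.
scans.create_index([("scan_text", TEXT)])

# Find every production microservice whose scan mentions the CVE.
for doc in scans.find({"$text": {"$search": "CVE-2021-44228"}},
                      {"service": 1, "_id": 0}):
    print(doc["service"])
```

One text index across the scan documents is all it takes; no schema changes needed when a new vulnerability format shows up, which is exactly why MongoDB fit this tool so well.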