Sep 2, 2013

Yet Another Programming Language (YAPL)

I started, as I imagine many others have, with C# and Windows development.  It wasn't easy.  I began with basic application development: forms, business logic, some database interaction, and a little multithreading with background workers.  At the same time, I started running Linux on my personal laptop and desktop at home.  That's when things started to roll a bit faster.

Linux really started to shine, for me, as I figured out how to make things work the way I wanted them to.  I went from one configuration file to another, breaking things and then fixing them.  I got to the point where I was looking at programming languages.  And, at some point, I started writing Python.

It was pretty cool.  The forced indentation was a little odd in the beginning, but I got used to it.  It was my first major scripting language (after shell scripting, of course).  I started using Vim.  I learned about sandboxing applications (à la virtualenv), deployment, maintaining packages, and managing third-party library dependencies.  I also learned about multiprocessing, static code analysis tools like Pylint, unit testing, the global interpreter lock, meta-programming, and logging (though not necessarily in that order).  I grew to really enjoy Python.  Then I did some PHP on the side.

And, well, that wasn't quite the pleasurable experience that I'd hoped it would be.  It was probably the code base I was operating in, and how much technical debt the team was trudging through to try to get things done.  At this point, I was also trying to get some experience with compiled languages.  So, after a brief comparison and review of C and C++, I chose to stick with C++.  It was also at this point that I wanted to focus a little more on my knowledge of data structures and algorithms.

Which pretty much brings me to the present day.  I've had the opportunity, in the last six months, to work with C++ full-time (with the occasional Ruby and Bash thrown in).  I have enjoyed, and continue to enjoy, using C++ as a language.  It's powerful, and you can do pretty much anything you want with it, as you can with Java and most other mature programming languages.  So, now that I've worked with Ruby and C++ a little, I've found myself in a strange position.  I tend to do a lot more bug fixing in C++ than new development (about 98% bug fixes).  But, I've been supplementing with personal projects that focus on new development in C++.  And, that's been pretty cool.

There were some things that bugged me, though.  Why does unit testing have to be so laborious in C++?  Why do I have to write so much boilerplate code to get a nice logging API for syslog?  These two things would probably not have bugged me so much if I'd dealt with them during the daytime, but I was doing development late at night when I was tired -- so the aggravation was magnified.  I'd even worked with templates for a while, and implemented a few algorithms to see how they would both look and perform.  I'd worked with sockets, file descriptors, and multiprocessing.  So, I felt like I'd been exposed to a lot of what the language had to offer.
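To illustrate the syslog complaint above, here's a minimal sketch of the kind of wrapper boilerplate I mean -- the class and method names are my own invention, just for illustration:

#include <syslog.h>
#include <string>

// A thin RAII wrapper around the C syslog API.
class SyslogLogger {
public:
    explicit SyslogLogger(const std::string& ident)
        : ident_(ident) {
        // LOG_PID prepends the process id; LOG_USER is the generic facility.
        openlog(ident_.c_str(), LOG_PID, LOG_USER);
    }
    ~SyslogLogger() { closelog(); }

    void debug(const std::string& msg) { syslog(LOG_DEBUG, "%s", msg.c_str()); }
    void info(const std::string& msg)  { syslog(LOG_INFO,  "%s", msg.c_str()); }
    void error(const std::string& msg) { syslog(LOG_ERR,   "%s", msg.c_str()); }

private:
    // openlog() keeps a pointer to the ident string, so we hold a copy
    // that outlives the logger.
    std::string ident_;
};

Not much code, but it's the kind of thing you end up rewriting in every project.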

Now, I will say this, for all you hardcore C++ programmers:  I could spend a decade working with that language and still not master it.  The thing is, a software developer needs to keep a balance between the desire to become good at any one language and the desire to stay relevant.  Staying relevant is hard to do.  I can name a few things that will probably help non-GUI programmers like me to stay relevant these days:
  • Practicing search and graph algorithms (a quick sketch follows this list)
  • Designing and working with fault-tolerant distributed systems
  • Working with Apache Hadoop and MapReduce
  • Working with Machine Learning algorithms
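Since I mentioned practicing algorithms, here's the kind of quick exercise I have in mind -- a plain breadth-first search over an adjacency list (a toy example, nothing more):

#include <queue>
#include <vector>

// Visit every node reachable from 'start', nearest-first, and return
// the order in which the nodes were seen.
std::vector<int> bfs_order(const std::vector<std::vector<int>>& adj, int start) {
    std::vector<bool> seen(adj.size(), false);
    std::vector<int> order;
    std::queue<int> frontier;

    seen[start] = true;
    frontier.push(start);
    while (!frontier.empty()) {
        int node = frontier.front();
        frontier.pop();
        order.push_back(node);
        for (int next : adj[node]) {   // enqueue unvisited neighbors
            if (!seen[next]) {
                seen[next] = true;
                frontier.push(next);
            }
        }
    }
    return order;
}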

--  The YAPL moment --

I want to stay relevant, period [1].  No desire is greater for me than staying relevant and useful as a software developer.  So, I'll plug in a new language and try to do some awesome stuff with it.

There are some things that I want to focus on:  
  • Concurrency - without reinventing the wheel
  • Parallelism - again, without reinventing the wheel
  • More application functionality with less code
And, there are some things I want to learn about:
  • Machine Learning
  • "Big Data" -- (whatever-the-crap that means), 
  • Artificial Intelligence
  • Data Analysis

After reading several posts on Reddit (mostly in /r/programming), I've found Scala to be the most intriguing.  I'd always thought that Java/Scala was pretty close to C# in terms of syntax.  Scala seemed to be an improved (slightly less verbose) language for the JVM.  Boy, if there was ever a case of not knowing what I didn't know, that was it.  C# is nothing like Java/Scala when it comes to everything that happens outside of the editor.  You can use an integrated development environment, or just the command line with the Scala compiler.  I prefer Vim and the compiler.  It keeps things simple.

Anyways, that's how I plan on staying relevant.  :)  Stay tuned for more on this.

-- 

1:  Suggestions on how to stay relevant are appreciated; feel free to leave a comment below.

Jul 26, 2013

Performance Metrics and Successful Products

It's been a while since my last post, but I definitely want to keep this blog up and stay active here.  So, here's another one.

Terms


"the team" - The team that works together to execute the vision of a product in order to deliver it to the customer so that we all get paid at the end of the day.

Who should care about performance metrics?


Well, everyone on the team.  Several articles and videos on performance come to mind.  I've heard it mentioned that improving the performance of a web application makes a lot more money and increases engagement.  I'm sure you've had the experience of dealing with something slow versus a performant system of some sort.  When things perform well, you have a positive experience (provided that they also work properly).  When things are slow, even if they work properly, you still have a bad experience.

I would encourage product managers who read this post to consider adding functionality to their product that helps the team measure performance alongside the metrics normally used to identify success: conversions, time-per-page, time-logged-in, time-on-the-site, number-of-users, etc.  This helps to show how the performance of the product is directly tied to the success of the product as a whole.

Now, don't get me wrong, it's hard to build things like this into the product.  For one, it slows the overall product down if you do it the wrong way.  And two, it's hard to get management support for something that, in the end, the user should not be impacted by (if it's done right).
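As a sketch of what that could look like (the record layout and names here are just my invention, not a prescription), imagine tagging each request's latency with the business outcome so the two can be correlated later:

#include <cstdio>

// One record per request: how long it took, and whether it ended in a
// conversion.  Correlating these over many requests shows how latency
// and success move together.
struct RequestMetric {
    const char* page;        // endpoint or page that was hit
    double      latency_ms;  // time taken to serve the request
    bool        converted;   // did the user convert on this request?
};

void record(const RequestMetric& m) {
    // Stand-in for a real metrics pipeline -- just print the record.
    std::printf("%s latency=%.2fms converted=%d\n",
                m.page, m.latency_ms, m.converted ? 1 : 0);
}

int main() {
    record({"/checkout", 143.7, true});
    record({"/checkout", 2210.4, false});  // slow request, no conversion
    return 0;
}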

What should be measured (and by whom), when measuring performance metrics?


I've specifically added the "by whom" part here because of a recent experience.  If one is stress/load testing a system, then one should not really expect the system itself to be able to report its own performance metrics in a timely fashion.  That's because it should be too busy doing what it normally does to be able to respond to you.  Measure from the outside instead.  That said, what should or can be measured might include the following:

  • Time per database transaction/query.
  • Time to render the page.
  • Time to get a piece of data from point A to point B.  
  • Time to perform [some cool operation].
  • Time to complete a method call.
  • Time to construct/destruct an object.

If you haven't already noticed, there's an important pattern here.  It's the time taken to perform an operation:  (end time stamp - start time stamp) / (number of operations performed).  If the number of operations is quite large and you're timing each operation individually, then the timing itself may be severely affecting the performance of the application.  In that case, I would recommend you time in batches, or time the whole group of operations together and take the average.
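Here's a quick sketch of that batching advice using std::chrono (the function name and the dummy workload are mine, just for illustration):

#include <chrono>
#include <cstddef>
#include <cstdio>

// Time a whole batch of operations once and divide by the count,
// rather than timing each operation individually.
template <typename Op>
double average_seconds(Op op, std::size_t count) {
    using clock = std::chrono::steady_clock;
    const auto start = clock::now();               // start time stamp
    for (std::size_t i = 0; i < count; ++i) {
        op();                                      // operation under test
    }
    const auto end = clock::now();                 // end time stamp
    const std::chrono::duration<double> elapsed = end - start;
    // (end time stamp - start time stamp) / (number of operations performed)
    return elapsed.count() / static_cast<double>(count);
}

int main() {
    volatile long sink = 0;
    const double avg = average_seconds([&] { sink = sink + 1; }, 1000000);
    std::printf("average: %.9f seconds per operation\n", avg);
    return 0;
}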

Where should the measuring of performance metrics be performed?


This might sound strange to product owners or quality assurance personnel, but performance metrics should be taken from a system that is built, configured, started, run, stopped, and torn down by an automated process.  I would recommend using Vagrant with Puppet here (if you're using *nix), but really any scripted automation combined with virtualization should do the trick.  Start small, iterate quickly, and don't be afraid of change as it's built.  Also, don't couple your testing/measuring infrastructure to your product.  If you build a super cool way to compute the number of seconds between two time stamps, don't use that same library when measuring performance.  If that library breaks, for whatever reason, your testing infrastructure is also hosed.

When should performance metrics be measured?


From the beginning, frequently through the middle, and always at the end.  That pretty much covers it. :)

You should probably build performance testing into your automated build steps.  If you have a continuous integration environment set up, then you probably already have automated unit testing and functional testing.  Assuming you have made code changes, run unit tests locally (and they pass), checked in your code changes, and pushed your commits to the remote, the following automated steps could be set up:

  1. Unit tests are executed again.
  2. If the unit tests pass, set up one or more VMs for functional testing with minimal hardware specifications.
  3. Perform functional tests.
  4. Tear down the VMs used for functional testing.
  5. If the functional areas perform as expected, set up one or more VMs for performance testing with more realistic/production-like hardware specifications (one might even use Amazon EC2 for this).
  6. Perform performance tests.
  7. Tear down the VMs used for performance testing.
  8. If the product performs as expected, then deploy to staging and notify QA to manually verify.
  9. If the product looks good after a cursory review by QA personnel in staging, and the unit/functional/performance test results look good, then deploy to production.

Whoa.  Did you say push to production after a cursory glance at test results and the product itself?  Yes, I did.  :)  That's how a beautiful, automated, continuous integration environment is supposed to work.

Why should we measure performance metrics at all?


Because, in the end, that's what your customers want.  They want a product that they can use, and usability is directly tied to the product performing well.


-- Hope you liked this article.  My next article is going to be about performance issue investigation.

Apr 28, 2013

Ieiunium Tela -- Fast Web 1.0.0

This is pretty insane.  I just finished up roughly three months of work toward my goal:  create a viable web server in C++.  Here's my historical commit graph for that time period.

And, as usual, my goals for the project have changed over time.  I've gone from creating a TCP server, to creating an HTTP server, to creating an HTTP server with loadable module capabilities.  Yes, I'm ambitious as all-get-out.  :)  My wife will never dispute that.
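For context, here's roughly where a project like this starts -- a bare TCP accept loop that answers every connection with a canned HTTP response.  This is not the project's actual code, just a sketch of the starting point:

#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>
#include <cstdio>
#include <cstring>

int main() {
    // Create a listening TCP socket on port 8080.
    int listener = socket(AF_INET, SOCK_STREAM, 0);
    if (listener < 0) { perror("socket"); return 1; }

    sockaddr_in addr;
    std::memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(8080);
    if (bind(listener, reinterpret_cast<sockaddr*>(&addr), sizeof(addr)) < 0) {
        perror("bind");
        return 1;
    }
    listen(listener, 16);

    for (;;) {
        int client = accept(listener, nullptr, nullptr);
        if (client < 0) continue;
        // A real HTTP server parses the request here; this one just answers.
        const char reply[] = "HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nok";
        write(client, reply, sizeof(reply) - 1);
        close(client);
    }
}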

Here's my basic test:

bitcycle cypher [git:master]
  => ~/git/IeiuniumTela
 $ curl localhost:8080/foo/bar/baz -i
HTTP/1.1 200 OK
Date: Sun Apr 28 22:15:03 2013
Content-Length: 28
Content-Type: text/html
User-Agent: IeiuniumTela

You requested:  /foo/bar/baz

And here's the server responding:

bitcycle cypher [git:master]
  => ~/git/IeiuniumTela
 $ ./build/main.bin conf/default.yaml
[DEBUG] Creating server using port 8080
[DEBUG] Listening...
[DEBUG] Parsing request
[DEBUG] [127.0.0.1] GET request / 89 bytes
[DEBUG] Userspace handling request
[DEBUG] Creating response
[DEBUG] Formatting response
[DEBUG] Returning response

Feel free to browse the code, if you like.

The final question here is, will I continue development on this thing?  

Hell yes.  :)