For those who like having control of their servers and need to validate many ideas quickly, the lowest-overhead way I've found to stand up a service that needs an exposed API, a backing filesystem, and a GPU is to write a Flask route around your core logic and run it under gunicorn with some number of workers. Because gunicorn uses a pre-fork model, where each worker is a separate process (with optional gevent or eventlet worker classes if you want green threads), it's easy to scale up by raising the worker count. For services where traffic is sparse but critical, it's great to have "multiprocess" Python and get past the GIL.
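To make that concrete, here is roughly what the setup can look like. This is only a minimal sketch: the module name `app.py`, the `/predict` route, and the `run_core_logic` function are hypothetical stand-ins for whatever your service actually does, and the worker count and port are arbitrary.

```python
# app.py -- minimal sketch; the route and function names are hypothetical.
from flask import Flask, jsonify, request

app = Flask(__name__)


def run_core_logic(payload: dict) -> dict:
    """Placeholder for the real work (e.g. a GPU model call or a file read)."""
    text = str(payload.get("text", ""))
    return {"length": len(text)}


@app.route("/predict", methods=["POST"])
def predict():
    # silent=True returns None instead of raising on a missing or invalid JSON body.
    payload = request.get_json(silent=True) or {}
    return jsonify(run_core_logic(payload))


# Run under gunicorn with several worker processes; each worker is a separate
# Python process with its own GIL:
#   gunicorn --workers 4 --bind 0.0.0.0:8000 app:app
# For green-thread workers instead, install gevent and add: --worker-class gevent
```

One thing to watch with a GPU behind this: since each worker is its own process, anything loaded at import time (a model, say) is likely to be loaded once per worker, so GPU memory tends to cap how many workers you can run.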

kumquatexpress