Store filtered tail messages into a new log file

tail -100f /path/to/log/file.log | egrep –line-buffered ” | tee /path/to/destination/file.log


virtualenvwrapper configuration

There are files for hooks and configuration is very straight forward.
I’d like to create a new folder named src for storing all the code base. Then, change directory to the new folder. Let’s see what I can do.

There is postrmvirtualenv file under $WORKON_HOME. In my case, it is ~/venv.
Let’s open up the file in vi editor and add some hooks.

$ vi ~/venv/postrmvirtualenv

# Add following to the file and save.
mkdir $path
cd $path

I’ve created a variable for the path of the source folder I mentioned. Then, create it and change directory to it.

Now, mkvirtualenv command lets you to create a new virtual env box and create a new folder called src under the project just created and change directory to the folder.

Let’s make some changes for workon command same as above.

I’d like to change directory to the src folder under the project once the project has been activated. Also, I’d like to change directory to the virtualenv root folder ~/venv once the project has been deactivated.

Open up postactivate and add below for activate behavior.


Open up postactivate and add below for activate behavior.


That’s it and enjoy.

git setup with multiple keys(accounts)

I had 2 accounts in and I couldn’t use one rsa key to access both accounts. It seems like gitlab’s restriction.

So, I needed to create another key for the 2nd account. Let’s try.

  1. Generate a new key
    $ ssh-keygen -t rsa -b 4096 -C ""
    Generating public/private rsa key pair.
    Enter file in which to save the key (/Users/myid/.ssh/id_rsa): /Users/myid/.ssh/new_rsa
    Enter passphrase (empty for no passphrase):
    Enter same passphrase again:
    Your identification has been saved in /Users/myid/.ssh/new_rsa.
    Your public key has been saved in /Users/myid/.ssh/
  2. Update ssh config file
    $ vi ~/.ssh/config
    Host gitlab-com-new
      User git
      IdentityFile ~/.ssh/new_rsa
  3. Add the new key created to ssh agent
    $ ssh-add ~/.ssh/new_rsa
  4. Add or update the git repository path
    $ git remote add origin git@gitlab-com-new:new_id/new_repository.git
    # or if the repository already exists
    $ git remote set-url origin git@gitlab-com-new:new_id/new_repository.git
  5. Enjoy 😛

Getting the last sub entries per entry

Now, I am trying to get the sizes of the last created versions of each contents.

Here’s the simple model diagram.


It looks just like a blog post and its comments. Here’s sample data.

Content model

id name
1 My Content 1
2 My Content 2

Version model

id parent_id size
1 1 100
2 1 200
3 1 300
4 2 400
5 2 500

The records with grayed background would be the result set as following table.

Expected result set (Versions only)

id parent_id size
3 1 300
5 2 500


My first thought was using group by with Max() function which requires a join or sub query. The 2nd thought was using distinct, but I wasn’t sure that grabbing the last ones only per parent. Anyway, here are the django orm and the sql queryset generated.
I tested with django-debug-toolbar’s debugsqlshell command which shows sql query strings per command with execution times. Below, I discard all execution times from the result because the result may vary based on the performance of the machine tested with.

$ ./ debugsqlshell

>>> Version.objects.distinct('parent').order_by('parent_id', '-created_at')
SELECT DISTINCT ON ("content_version"."parent_id") "content_version"."id",
FROM "content_version"
ORDER BY "content_version"."parent_id" ASC,
         "content_version"."created_at" DESC

Now I need to get size of each versions.

>>> Version.objects.filter(parent__owner=owner).distinct('parent').order_by('parent_id', '-created_at')
SELECT DISTINCT ON ("content_version"."parent_id") "content_version"."id",
FROM "content_version"
INNER JOIN "content_content" ON ("content_version"."parent_id" = "content_content"."id")
WHERE ("content_version"."is_archived" = false
       AND "content_content"."is_archived" = false
       AND "content_content"."owner_id" = '65ed9958-788e-11e6-9b70-a820663cfc41'::uuid)
ORDER BY "content_version"."parent_id" ASC,
         "content_version"."created_at" DESC

Now I need to add a filter to getting none-archived records and owned by a user.

>>> Version.objects.filter(parent__owner=owner).distinct('parent').order_by('parent_id', '-created_at').values_list("size", flat=True)
SELECT DISTINCT ON ("content_version"."parent_id") "content_version"."size"
FROM "content_version"
INNER JOIN "content_content" ON ("content_version"."parent_id" = "content_content"."id")
WHERE ("content_version"."is_archived" = false
       AND "content_content"."is_archived" = false
       AND "content_content"."owner_id" = '65ed9958-788e-11e6-9b70-a820663cfc41'::uuid)
ORDER BY "content_version"."parent_id" ASC,
         "content_version"."created_at" DESC

So far so good. 🙂
Oh, Now I want to get the total size of all the last versions.
I tried to add aggregation to get Sum() of size, and the django's orm results broken queries. If anyone knows how to get the result with django's orm, please let me know. I may buy you a cup of coffee or beer sometime.

Anyway, so I decided to use raw sql for the result and I didn't want to use python's sum() function because the number of records was arbitrary and may get lots of records that would make the application slow down.

Here's the last code snippet I used.

def get_storage_bytes_used(owner):
    Returns total bytes used for user.
    The calculation is Sum of latest versions of each `Content` records.

        owner (User): An owner of Content object.

        bytes_used (int): Total bytes used for a given user.
    query = '''SELECT sum(size) from (
        SELECT DISTINCT ON (v."parent_id") v."size"
        FROM "content_version" AS v
        INNER JOIN "content_content" AS c ON (v."parent_id" = c."id")
        WHERE (v."is_archived" = false
            AND c."is_archived" = false
            AND c."owner_id" = '%(owner_id)s'::uuid)
        ORDER BY v."parent_id" ASC, v."updated_at" DESC) as t'''
    context = {'owner_id':}

    with connection.cursor() as c:
        c.execute(query % context)
        bytes_used = c.fetchone()

    return bytes_used[0] or 0

Happy coding.

ignore certain files(types) from git diff

Sometimes it’s silly when reviewing the code change from git diff command which is a simple way to see the differences between the codes from the previous and the updated and the commit contains a large css file. Actually I use less for generating css, so I don’t really need to open up and see the css file. Anyways, you can ignore any file you want from git diff. Here’s how.

You can create a .gitattributes file in the root of the repo. Inside that file, you can now exclude certain files/types from the git diff command just like the way of excluding files from .gitignore:

path/to/file.binary -diff
path/to/files/*.css -diff

Django @property executes extra database calls? Ensure using cached result instead.

People made mistakes as below code that executes database call every time it’s invoked.

class MyModel(models.Model):
    def is_ready(self):
        status = self.check_status_from_db()
        return True if status else None

Assuming that there’re 2 or more {{ object.is_ready }} in your template especially in a loop on a list page. See what would happen via django debug toolbar.

Let’s fix the issue with the below code.

class MyModel(models.Model):
    def is_ready(self):
        if not hasattr(self, '_is_ready'):
            status = self.check_status_from_db()
            self._is_ready = True if status else None
        return self._is_ready

Now, retest the code and see what’s difference.