[Python] get a list of sorted directories and/or files

This post provides a piece of Python code to sort files, folders, and the combination of files and folders in a given directory. It works for Python 3.x. (It should also work for Python 2.x if you change the print calls to Python 2.x print statements.)

The code reports the oldest and newest file(s), folder(s), or file(s) + folder(s) in a given directory, with each list sorted by modified time.

import os

# Change this to the parent directory whose contents you would like to sort
path = 'parent_directory_name'

if not os.path.isdir(path):
    print("the directory does not exist")
else:
    # Work inside the directory so the bare names from os.listdir() resolve
    os.chdir(path)

    # The files variable contains all files and folders under the path
    # directory, sorted by modified time (oldest first)
    files = sorted(os.listdir(os.getcwd()), key=os.path.getmtime)

    if len(files) == 0:
        print("there are no regular files or folders in the given directory!")
    else:
        # folder list
        directory_list = []
        # regular file list
        file_list = []

        for f in files:
            if os.path.isdir(f):
                directory_list.append(f)
            elif os.path.isfile(f):
                file_list.append(f)

        if len(directory_list) == 0:
            print("there are no folders in the given directory!")
        else:
            oldest_folder = directory_list[0]
            newest_folder = directory_list[-1]
            print("Oldest folder:", oldest_folder)
            print("Newest folder:", newest_folder)
            print("All folders sorted by modified time -- oldest to newest:", directory_list)

        if len(file_list) == 0:
            print("there are no (regular) files in the given directory!")
        else:
            oldest_file = file_list[0]
            newest_file = file_list[-1]
            print("Oldest file:", oldest_file)
            print("Newest file:", newest_file)
            print("All (regular) files sorted by modified time -- oldest to newest:", file_list)

        # The combined listing is only printed when both kinds are present
        if len(file_list) > 0 and len(directory_list) > 0:
            oldest = files[0]
            newest = files[-1]
            print("Oldest (file/folder):", oldest)
            print("Newest (file/folder):", newest)
            print("All (file/folder) sorted by modified time -- oldest to newest:", files)
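If you would rather not change the working directory, the same idea can be wrapped in a function. This is only a minimal sketch, not part of the original script: it uses os.scandir instead of os.chdir plus os.listdir, and the function name sort_by_mtime is made up for illustration.

```python
import os


def sort_by_mtime(path):
    """Return (folders, files, everything), each sorted oldest to newest.

    Hypothetical helper: the name and return shape are assumptions made
    for this sketch, not something the original script defines.
    """
    with os.scandir(path) as entries:
        # Sort every directory entry by its modification time
        ordered = sorted(entries, key=lambda entry: entry.stat().st_mtime)
        folders = [entry.name for entry in ordered if entry.is_dir()]
        files = [entry.name for entry in ordered if entry.is_file()]
        everything = [entry.name for entry in ordered]
    return folders, files, everything
```

Calling sort_by_mtime('parent_directory_name') gives three lists; the first and last item of each list are the oldest and newest entry of that kind.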

See below for a pic of the code.

Saving IPython/Jupyter notebook as PDF on Ubuntu

If you try to save your Jupyter notebook as a PDF file on Ubuntu and the export fails (typically with an error saying that xelatex cannot be found), this post is for you.

The solution:

XeLaTeX is part of the texlive-xetex package.

To install on Ubuntu, run the following command: 

$ sudo apt-get install texlive-xetex

Now you can download your ipynb file as PDF!
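The export can also be run from a script. The sketch below is an assumption-laden illustration rather than an official recipe: it checks whether xelatex is on the PATH and, if so, builds the jupyter nbconvert --to pdf command. The function name pdf_export_command and the notebook filename are made up.

```python
import shutil


def pdf_export_command(notebook, xelatex_path=None):
    """Build the nbconvert command, or return None if xelatex is missing.

    Hypothetical helper: xelatex_path can be passed explicitly, otherwise
    it is looked up on the PATH with shutil.which.
    """
    if xelatex_path is None:
        xelatex_path = shutil.which("xelatex")
    if xelatex_path is None:
        # texlive-xetex is probably not installed yet
        return None
    return ["jupyter", "nbconvert", "--to", "pdf", notebook]
```

If the function returns a command, it can be handed to subprocess.run(command, check=True); if it returns None, install texlive-xetex first.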

Import CSV using Pandas to Django models

This post introduces how to import CSV data using Pandas to Django models.

Python has a built-in csv library, but I recommend not using it here: it is not flexible enough for CSV data that mixes string and numeric columns. See the reasons below:

builtin csv module is very primitive at handling mixed data-types, does all its type conversion at import-time, and even at that has a very restrictive menu of options, which will mangle most real-world datasets (inconsistent quoting and escaping, missing or incomplete values in Booleans and factors, mismatched Unicode encoding resulting in phantom quote or escape characters inside fields, incomplete lines will cause exception). Fixing csv import is one of countless benefits of pandas. So, your ultimate answer is indeed stop using builtin csv import and start using pandas.
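To make the mixed-type problem concrete, here is a small sketch with made-up sample data (the column names are borrowed from the example later in this post). The built-in csv module hands every field back as a string, so numbers and missing values have to be converted by hand; the to_int helper is hypothetical.

```python
import csv
import io

# Made-up sample: one numeric column with a missing value in the last row
sample = "Name;Votes\nalpha;3\nbeta;\n"
rows = list(csv.DictReader(io.StringIO(sample), delimiter=";"))


def to_int(text, default=None):
    """Convert a CSV field to int, falling back to default when empty."""
    return int(text) if text.strip() else default


# Every value from the csv module is a str, so convert explicitly
votes = [to_int(row["Votes"]) for row in rows]
```

With pandas, read_csv infers column types and marks missing values for you, which is the convenience the quote above is pointing at.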

Do not insert the data from the CSV file into Django models row by row; that is too slow.

Django (version 1.4 and later) provides bulk_create, a model manager method that takes a list of objects created with the model class constructor and inserts them together.

See my example code below:

import pandas as pd

# Note: myClass_in_model is the model class (i.e., the table you want to
# populate with data from the CSV) defined in your Django models.py;
# import it from your own app before running this.
# Note: field_1 to field_4 are the fields you defined in your Django model.

df = pd.read_csv('test_csv.txt', sep=';')
# print(df)

# Build one unsaved model instance per CSV row
objs = [
    myClass_in_model(
        field_1=row['Name'],
        field_2=row['Description'],
        field_3=row['Notes'],
        field_4=row['Votes'],
    )
    for index, row in df.iterrows()
]

# Insert all the objects in bulk instead of saving them one at a time
myClass_in_model.objects.bulk_create(objs)
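For very large CSV files you may prefer to insert the objects in batches rather than all at once. bulk_create accepts a batch_size argument for this; the sketch below shows the same batching idea in plain Python so you can see what it does. The helper name chunked and the batch size of 500 are arbitrary choices for illustration.

```python
from itertools import islice


def chunked(iterable, size):
    """Yield successive lists of at most `size` items from iterable."""
    iterator = iter(iterable)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


# Hypothetical usage with the objs list and model from the example above:
# for batch in chunked(objs, 500):
#     myClass_in_model.objects.bulk_create(batch)
```

In practice, myClass_in_model.objects.bulk_create(objs, batch_size=500) should give a similar effect without the helper.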

 

References:

Import csv data into django models

How to write a Pandas Dataframe to Django model

Django bulk_create function example

Changing strings to Floats in an imported .csv