Data Engineering Series
As part of the Data Engineering Series, we have already covered part 1 (Data Engineering: Introduction) and part 2 (Basic Python). Continuing from the previous post on Basic Python, this post covers advanced Python concepts.
Magic Methods in Python
In Python, magic methods are special methods whose names start and end with double underscores
- Magic methods are not meant to be invoked directly by you; Python calls them internally once a certain action is performed on an object
- Examples of magic methods are __new__, __repr__, __init__, __add__, __len__ and __del__. For instance, the __init__ method used for initialization is invoked automatically when an instance is created, without an explicit call
- Use the dir() function to list the magic methods that a class defines or inherits
- The advantage of using Python's magic methods is that they provide a simple way to make objects behave like built-in types
- Magic methods let user-defined objects emulate the behavior of built-in types. Therefore, whenever you need to control how an object of your class is displayed or how it responds to operators and built-in functions, use magic methods.
Example :
v = 4
v.__add__(2)
Implementation —
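# __del__ method
from os.path import expanduser, join

class FileObject:
    def __init__(self, file_path='~', file_name='test.txt'):
        # expanduser turns '~' into the home directory; open() does not do this itself
        self.file = open(join(expanduser(file_path), file_name), 'rt')

    def __del__(self):
        # Runs when the object is about to be destroyed
        self.file.close()
        del self.file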
Implementation —
# __repr__ method
class String:
    def __init__(self, string):
        self.string = string

    def __repr__(self):
        return 'Object: {}'.format(self.string)
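To make the idea of "behaving like a built-in type" more concrete, here is a minimal sketch. The Vector class is hypothetical and exists only for illustration; it shows how operators and built-in functions are routed to magic methods.

```python
class Vector:
    """Hypothetical 2-D vector used only to illustrate magic methods."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __add__(self, other):
        # Called for: self + other
        return Vector(self.x + other.x, self.y + other.y)

    def __eq__(self, other):
        # Called for: self == other
        return (self.x, self.y) == (other.x, other.y)

    def __len__(self):
        # Called for: len(self); here, the number of components
        return 2

    def __repr__(self):
        # Called by repr(), and by print() when __str__ is not defined
        return 'Vector({}, {})'.format(self.x, self.y)


total = Vector(1, 2) + Vector(3, 4)
print(total)        # Vector(4, 6)
print(len(total))   # 2
```

Nothing in this code calls __add__ or __len__ by name; the + operator and len() trigger them, which is the point made in the list above.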
Inheritance and Polymorphism in Python
- In Python, inheritance and polymorphism are very powerful and important concepts
- Using inheritance, a class can use (inherit) all the data fields and methods available in its parent class
- On top of that, you can add your own methods and data fields
- Python allows multiple inheritance, i.e., a class can inherit from multiple classes
- Inheritance provides a way to write better-organized code and to reuse code
Syntax —
class ParentClass:
    Body of parent class
class DerivedClass(ParentClass):
    Body of derived class
- In Python, polymorphism allows us to define a method in the child class with the same name as a method in its parent class; the child's version is the one called on child objects
Example —
class X:
    def sample(self):
        print("sample() method from class X")

class Y(X):
    def sample(self):
        print("sample() method from class Y")
Implementation —
# Inheritance
class Vehicle:
    def __init__(self, name, color):
        self.__name = name
        self.__color = color

    def getColor(self):
        return self.__color

    def setColor(self, color):
        self.__color = color

    def get_Name(self):
        return self.__name

class Bike(Vehicle):
    def __init__(self, name, color, model):
        super().__init__(name, color)  # call the parent class initializer
        self.__model = model

    def get_details(self):
        return self.get_Name() + " " + self.__model + " in " + self.getColor() + " color"

b_obj = Bike("Cziar", "red", "TK720")
print(b_obj.get_details())
print(b_obj.get_Name())
Output —
Cziar TK720 in red color
Cziar
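Multiple inheritance, mentioned in the list above, can be sketched in the same way. The Engine, Radio and Car classes are hypothetical names chosen for illustration. The sketch also prints the method resolution order, which is the order in which Python searches the classes for a method.

```python
class Engine:
    def start(self):
        return "engine started"


class Radio:
    def play(self):
        return "radio playing"


class Car(Engine, Radio):
    """Inherits methods from both parent classes."""

    def describe(self):
        return self.start() + ", " + self.play()


car = Car()
print(car.describe())                          # engine started, radio playing
print([cls.__name__ for cls in Car.__mro__])   # ['Car', 'Engine', 'Radio', 'object']
```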
Implementation —
# Polymorphism
from math import pi

class Shape:
    def __init__(self, name):
        self.name = name

    def area(self):
        pass

class Square(Shape):
    def __init__(self, length):
        super().__init__("Square")
        self.length = length

    def area(self):
        return self.length**2

class Circle(Shape):
    def __init__(self, radius):
        super().__init__("Circle")
        self.radius = radius

    def area(self):
        return pi*self.radius**2

a = Square(6)
b = Circle(10)
print(a.area())
print(b.area())
Output —
36
314.1592653589793
Errors and Exception Handling in Python
In Python, an error can be a syntax error or an exception.
Syntax errors occur when the parser detects an incorrect statement.
- Exceptions are raised when an event occurs during execution that disrupts the normal flow of the program
- An exception occurs whenever syntactically correct Python code results in an error at run time
- Python comes with various built-in exceptions, and users can also define their own exceptions
Some of python’s built in exceptions —
IndexError : When a sequence index is out of range
ImportError : When an imported module or name cannot be found
KeyError : When a key is not found in a dictionary
NameError : When a variable is not defined
MemoryError : When a program runs out of memory
TypeError : When a function or operation is applied to an object of an inappropriate type
AssertionError : When an assert statement fails
AttributeError : When an attribute reference or assignment fails
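Most of the exceptions listed above can be triggered on purpose. The sketch below does so with small throwaway expressions; the helper names raised_by and check_positive, and the module name no_such_module_xyz, are made up for illustration. MemoryError is left out because it is impractical to provoke safely.

```python
def raised_by(action):
    """Call a zero-argument function and return the exception it raises, if any."""
    try:
        action()
    except Exception as exc:
        return exc
    return None


def check_positive(value):
    assert value > 0, "value must be positive"


examples = [
    (IndexError, lambda: [1, 2, 3][5]),                       # index out of range
    (KeyError, lambda: {"a": 1}["b"]),                        # missing dictionary key
    (NameError, lambda: undefined_name),                      # name never defined
    (TypeError, lambda: "2" + 2),                             # str + int
    (AttributeError, lambda: (5).no_such_attribute),          # int has no such attribute
    (ImportError, lambda: __import__("no_such_module_xyz")),  # module does not exist
    (AssertionError, lambda: check_positive(-1)),             # assert condition is false
]

results = []
for expected, action in examples:
    exc = raised_by(action)
    results.append(isinstance(exc, expected))
    print(expected.__name__, "->", type(exc).__name__)
```

For the import case the printed name is ModuleNotFoundError, which is a subclass of ImportError, so an except ImportError clause still catches it.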
Try and Except in Python
In Python, exceptions can be handled using a try statement
- The block of code which can raise an exception is placed inside the try clause. The code that handles the exceptions is written in the except clause
- If no exception occurs, the except block is skipped and the normal flow of the program continues
- A try statement can have any number of except clauses to handle different exceptions, but at most one of them is executed when an exception occurs
- We can also raise exceptions using the raise keyword
- The try statement can have an optional finally clause, which executes regardless of what happens in the try and except blocks
Example :
try:
    print(a)  # a has not been defined, so this raises NameError
except NameError:
    print("Something went wrong")
finally:
    print("Exit")
Implementation —
# try, except, finally
try:
    print(1 / 0)
except ZeroDivisionError:
    print("Error occurred")
finally:
    print("Exit")
Output —
Error occurred
Exit
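The points about several except clauses and the raise keyword can be sketched as follows. The functions safe_divide and parse_age are hypothetical examples, not part of any library.

```python
def safe_divide(a, b):
    """Return a / b, or None if the division cannot be performed."""
    try:
        return a / b
    except ZeroDivisionError:
        # Only this clause runs when b is zero
        print("Cannot divide by zero")
        return None
    except TypeError:
        # Only this clause runs when a or b is not a number
        print("Both arguments must be numbers")
        return None
    finally:
        # Runs in every case, even though the branches above return
        print("Done")


def parse_age(text):
    """Convert text to a non-negative integer, raising ValueError otherwise."""
    age = int(text)  # int() itself raises ValueError for non-numeric text
    if age < 0:
        raise ValueError("age cannot be negative")
    return age


print(safe_divide(6, 3))    # Done, then 2.0
print(safe_divide(1, 0))    # Cannot divide by zero, Done, then None
print(safe_divide("a", 2))  # Both arguments must be numbers, Done, then None
```

Naming the exception type in each except clause, instead of using a bare except, keeps unrelated errors from being silently swallowed.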
User-defined Exceptions
In Python, you can define your own error type by creating a new exception class
- Exceptions need to be derived from the Exception class, either directly or indirectly
- User-defined error handling can be implemented by raising an exception explicitly, by using the assert statement, or by defining custom exception classes
- Use the assert statement to enforce constraints in the program. When the condition given in the assert statement is not met, the program raises an AssertionError
- You can raise an existing exception by using the raise keyword followed by the name of the exception
- To create a custom exception class with its own error message, derive it from the Exception class, either directly or through another exception class
- When creating a module that can raise several distinct errors, a common practice is to create a base class for the exceptions defined by that module and subclass it to create specific exception classes for different error conditions. This is called a hierarchy of custom exceptions
Example —
class ClassName(Exception):
    pass
Implementation —
class Error(Exception):
    pass

class TooSmallValueError(Error):
    pass

number = 100
while True:
    try:
        num = int(input("Enter a number: "))
        if num < number:
            raise TooSmallValueError
        break
    except TooSmallValueError:
        print("Value too small")
Output —
Enter a number: 40
Value too small
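A hierarchy of custom exceptions with their own error messages might look like the sketch below. All class and function names here (ValidationError, TooSmallError, TooLargeError, validate) are hypothetical and the limits are arbitrary.

```python
class ValidationError(Exception):
    """Base class for the hypothetical errors in this sketch."""


class TooSmallError(ValidationError):
    def __init__(self, value, minimum):
        super().__init__("{} is smaller than the minimum {}".format(value, minimum))
        self.value = value


class TooLargeError(ValidationError):
    def __init__(self, value, maximum):
        super().__init__("{} is larger than the maximum {}".format(value, maximum))
        self.value = value


def validate(value, minimum=10, maximum=100):
    if value < minimum:
        raise TooSmallError(value, minimum)
    if value > maximum:
        raise TooLargeError(value, maximum)
    return value


messages = []
for candidate in (5, 50, 500):
    try:
        validate(candidate)
        messages.append("{} is valid".format(candidate))
    except ValidationError as err:  # one clause catches both subclasses
        messages.append("{}: {}".format(type(err).__name__, err))

print(messages)
```

Catching the base class handles every error from the hierarchy in one place, while callers that care about a specific condition can still catch TooSmallError or TooLargeError individually.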
Garbage Collection in Python
In Python, garbage collection is a memory management feature, i.e., the process of reclaiming memory that a running program has allocated but no longer needs, so that it can be reused
- In Python, garbage collection works automatically, which prevents memory from being wasted on objects that are no longer in use
- Garbage collection can be forced by calling the collect() function of the gc module
- When no reference to an object is left, the object is destroyed automatically and its __del__() method is executed. In CPython this usually happens through reference counting as soon as the last reference goes away, while the gc module's collector handles objects that reference each other in a cycle
Example :
import gc
gc.collect()
Implementation —
# manual garbage collection
import sys, gc

def test():
    items = [18, 19, 20, 34, 78]
    items.append(items)  # the list now refers to itself, forming a reference cycle

def main():
    print("Garbage Creation")
    for i in range(5):
        test()
    print("Collecting..")
    n = gc.collect()
    print("Unreachable objects collected by GC:", n)
    print("Uncollectable garbage list:", gc.garbage)

if __name__ == "__main__":
    main()
    sys.exit()
Output —
Garbage Creation
Collecting..
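Unreachable objects collected by GC: 33
Uncollectable garbage list: []
(The exact number of collected objects varies with the Python version and environment.)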
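The link between references, __del__() and gc.collect() can be made visible with a small sketch. The Resource class and the events list are hypothetical, and the timing noted in the comments assumes CPython; other interpreters may run __del__() later.

```python
import gc

events = []


class Resource:
    def __init__(self, name):
        self.name = name

    def __del__(self):
        # Record the destruction so it can be inspected afterwards
        events.append("deleted " + self.name)


# Case 1: the only reference is removed.
# In CPython, reference counting destroys the object right away.
r = Resource("a")
del r

# Case 2: two objects refer to each other, forming a cycle.
# Their reference counts never reach zero on their own,
# so they are only reclaimed when the cycle collector runs.
x = Resource("b")
y = Resource("c")
x.partner = y
y.partner = x
del x, y

gc.collect()  # force a collection so the cycle is found now
print(events)
```

After the run, events contains an entry for all three objects; the two objects in the cycle may be finalized in either order.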
Python Debugger
Debugging is the process of locating and fixing errors in a program. In Python, pdb, which is part of the standard library, is used to debug code
- The pdb module internally makes use of the bdb and cmd modules
- It supports setting breakpoints and single stepping at the source line level, inspection of stack frames, source code listing, etc.
Syntax —
import pdb
pdb.set_trace()
- Since Python 3.7 there is also a built-in function called breakpoint() for setting breakpoints; by default it calls pdb.set_trace()
Implementation —
import pdb

def multiply(a, b):
    answer = a * b
    return answer

pdb.set_trace()  # execution pauses here and the (Pdb) prompt opens
a = int(input("Enter first number : "))
b = int(input("Enter second number : "))
product = multiply(a, b)
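An interactive pdb session cannot be shown in a script that runs unattended, so the following sketch illustrates only how breakpoint() dispatches. It temporarily replaces sys.breakpointhook (available since Python 3.7) with a hypothetical recording_hook function; in normal use you would leave the hook alone and land in the pdb prompt.

```python
import sys

calls = []


def recording_hook(*args, **kwargs):
    """Stand-in hook: records that a breakpoint was hit instead of opening pdb."""
    calls.append("breakpoint hit")


def multiply(a, b):
    answer = a * b
    breakpoint()  # calls sys.breakpointhook(); the default hook runs pdb.set_trace()
    return answer


original_hook = sys.breakpointhook
sys.breakpointhook = recording_hook  # replace the default only for this demonstration
try:
    product = multiply(3, 4)
finally:
    sys.breakpointhook = original_hook  # restore the normal debugger behaviour

print(product, calls)  # 12 ['breakpoint hit']
```

With the default hook in place, the same breakpoint() line would stop inside multiply, where answer, a and b can be inspected at the (Pdb) prompt.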
Decorators in Python
In Python, a decorator is any callable Python object that is used to modify a function or a class. It takes a function, adds some functionality, and returns the modified function.
- Decorators are a very powerful and useful tool in Python since they allow programmers to modify or control the behavior of a function or class.
- In a decorator, a function is passed as an argument into another function and is then called inside a wrapper function.
- A decorator is usually applied with the @ syntax on the line directly above the definition of the function you want to decorate.
There are two different kinds of decorators in Python:
Function decorators
Class decorators
- When multiple decorators are stacked on a single function, they are applied from the bottom up: the decorator closest to the function definition is applied first
- A decorator can be reused by applying it to any number of functions
Implementation —
# Decorators
def test_decorator(func):
    def function_wrapper(x):
        print("Before calling" + func.__name__)
        res = func(x)
        print(res)
        print("After calling" + func.__name__)
        return res  # pass the result back to the caller
    return function_wrapper

@test_decorator
def sqr(n):
    return n**2

sqr(20)
Output —
Before callingsqr
400
After callingsqr
Implementation —
# Multiple Decorators
def lowercase_decorator(function):
    def wrapper():
        func = function()
        make_lowercase = func.lower()
        return make_lowercase
    return wrapper

def split_string(function):
    def wrapper():
        func = function()
        split_string = func.split()
        return split_string
    return wrapper

@split_string
@lowercase_decorator
def test_func():
    return 'MOTHER OF DRAGONS'

print(test_func())
Output —
['mother', 'of', 'dragons']
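The wrappers above only accept functions with a fixed number of arguments. A reusable decorator usually forwards any arguments and preserves the name of the decorated function; the sketch below shows one way to do that with functools.wraps from the standard library. The decorator name log_calls and the decorated functions are hypothetical.

```python
import functools


def log_calls(func):
    """Hypothetical decorator that records each call; works for any signature."""

    @functools.wraps(func)  # keep func.__name__ and its docstring on the wrapper
    def wrapper(*args, **kwargs):
        wrapper.calls.append((args, kwargs))
        return func(*args, **kwargs)

    wrapper.calls = []
    return wrapper


@log_calls
def add(a, b):
    return a + b


@log_calls
def greet(name, greeting="Hello"):
    return greeting + ", " + name


print(add(2, 3))                      # 5
print(greet("Ada", greeting="Hi"))    # Hi, Ada
print(add.__name__, add.calls)        # add [((2, 3), {})]
```

The same decorator is applied to two unrelated functions, and each decorated function keeps its own list of recorded calls.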
Memoization using Decorators
In Python, memoization is a technique which allows you to optimize a Python function by caching its output based on the parameters you supply to it.
- Once you memoize a function, it will only compute its output once for each set of parameters you call it with. Every call after the first will be quickly retrieved from a cache.
- If you want to speed up the parts in your program that are expensive, memoization can be a great technique to use.
There are four approaches to memoization:
Using global
Using objects
Using default parameter
Using a Callable Class
Implementation —
# Fibonacci series using memoization with decorators
def memoization_func(t):
    dict_one = {}
    def h(z):
        if z not in dict_one:
            dict_one[z] = t(z)
        return dict_one[z]
    return h

@memoization_func
def fib(n):
    if n == 0:
        return 0
    elif n == 1:
        return 1
    else:
        return fib(n-1) + fib(n-2)

print(fib(20))
Output —
6765
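Of the approaches listed above, the callable class can be sketched as follows. The class name Memoize is an assumption made for this example; it stores results in a dictionary keyed by the positional arguments.

```python
class Memoize:
    """Hypothetical callable class that caches results per argument tuple."""

    def __init__(self, func):
        self.func = func
        self.cache = {}

    def __call__(self, *args):
        # Compute the result only the first time these arguments are seen
        if args not in self.cache:
            self.cache[args] = self.func(*args)
        return self.cache[args]


@Memoize
def fib(n):
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)


print(fib(30))          # 832040
print(len(fib.cache))   # 31 cached entries, one for each n from 0 to 30
```

For everyday use, the standard library's functools.lru_cache decorator provides the same kind of caching without any custom code.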
Defaultdict
In Python, a dictionary is a container that holds key-value pairs. Keys must be unique and hashable, which in practice usually means immutable objects
- If you try to read a key that doesn't exist in a dictionary, Python raises a KeyError and stops execution unless the error is handled. To tackle this issue, Python provides defaultdict, a dictionary-like class in the collections module
- If you access a missing key with square brackets, defaultdict automatically creates the key and generates a default value for it
- A defaultdict with a default factory does not raise a KeyError for such lookups
- Any key that does not exist gets the value returned by the default factory
- Hence, whenever you need a dictionary in which each value should start from a default, use a defaultdict
Syntax —
from collections import defaultdict
demo = defaultdict(int)
Implementation —
from collections import defaultdict
default_dict_var = defaultdict(list)
for i in range(10):
    default_dict_var[i].append(i)
print(default_dict_var)
Output —
defaultdict(<class 'list'>, {0: [0], 1: [1], 2: [2], 3: [3], 4: [4], 5: [5], 6: [6], 7: [7], 8: [8], 9: [9]})
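Two common uses of the default factory are counting and grouping. The sketch below uses a made-up list of words to show both, and also shows that a membership test does not create a key.

```python
from collections import defaultdict

words = ["spark", "kafka", "spark", "airflow", "kafka", "spark"]

# Counting: int() returns 0, so every new key starts at zero
counts = defaultdict(int)
for word in words:
    counts[word] += 1

# Grouping: list() returns [], so every new key starts with an empty list
by_length = defaultdict(list)
for word in sorted(set(words)):
    by_length[len(word)].append(word)

print(dict(counts))        # {'spark': 3, 'kafka': 2, 'airflow': 1}
print(dict(by_length))     # {7: ['airflow'], 5: ['kafka', 'spark']}
print("flink" in counts)   # False: a membership test does not create the key
```

With a plain dict, the counting loop would need an explicit check or a call to get() before each increment.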
OrderedDict
In Python, OrderedDict is one of the container datatypes in the collections module and a subclass of dict. It maintains the order in which the keys are inserted, and that order is used when iterating. If a key is deleted and then inserted again, it moves to the end
- It's a dictionary subclass that remembers the order in which its contents are added
- When the value of an existing key is changed or overwritten, the key keeps its position in the ordering
- popitem() removes items in LIFO order by default; calling popitem(last=False) removes them in FIFO order
- The reversed() function can be used with OrderedDict to iterate over elements in reverse order
- OrderedDict has a move_to_end() method to efficiently reposition an element to an endpoint
Example —
from collections import OrderedDict
my_dict = {'Sunday': 0, 'Monday': 1, 'Tuesday': 2}
# creating ordered dict
ordered_dict = OrderedDict(my_dict)
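The ordering behaviour described above can be checked with a short sketch. The day names and values are arbitrary sample data.

```python
from collections import OrderedDict

days = OrderedDict([("Sunday", 0), ("Monday", 1), ("Tuesday", 2), ("Wednesday", 3)])

days["Monday"] = 10                        # overwriting keeps the key's position
days.move_to_end("Sunday")                 # move a key to the end
days.move_to_end("Wednesday", last=False)  # move a key to the front
order_after_moves = list(days)
print(order_after_moves)    # ['Wednesday', 'Monday', 'Tuesday', 'Sunday']

reversed_keys = list(reversed(days))
print(reversed_keys)        # ['Sunday', 'Tuesday', 'Monday', 'Wednesday']

first = days.popitem(last=False)   # FIFO: removes the first item
last = days.popitem()              # LIFO (default): removes the last item
print(first, last)          # ('Wednesday', 3) ('Sunday', 0)

remaining = list(days.items())
print(remaining)            # [('Monday', 10), ('Tuesday', 2)]
```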
Generators in Python
In Python, generator functions look like regular functions with one difference: they use the yield keyword instead of return. A generator function is a function that returns an iterator. A generator expression is an expression that also returns an iterator
- Generator objects are consumed either by calling the built-in next() function on them or by using them in a "for in" loop.
- A return statement terminates a function entirely, but a yield statement pauses the function, saves its state, and resumes from there on the next call.
- Generator expressions can be used as function arguments. Just like list comprehensions, generator expressions let you create a generator object in a single line of code.
- The major difference between a list comprehension and a generator expression is that a list comprehension builds the entire list in memory, while a generator expression produces one item at a time (lazy evaluation). For this reason, compared to a list comprehension, a generator expression is much more memory efficient
Example —
def generator():
    yield "x"
    yield "y"

for i in generator():
    print(i)
Implementation —
def test_sequence():
    num = 0
    while num < 10:
        yield num
        num += 1

for i in test_sequence():
    print(i, end=",")
Output —
0,1,2,3,4,5,6,7,8,9,
Implementation —
# Python generator with loop
# Reverse a string
def reverse_str(test_str):
    length = len(test_str)
    for i in range(length - 1, -1, -1):
        yield test_str[i]

for char in reverse_str("Trojan"):
    print(char, end=" ")
Output —
n a j o r T
Implementation —
# Generator Expression
# Initialize the list
test_list = [1, 3, 6, 10]
# list comprehension
list_comprehension = [x**3 for x in test_list]
# generator expression
test_generator = (x**3 for x in test_list)
print(list_comprehension)
print(type(test_generator))
print(tuple(test_generator))
Output —
[1, 27, 216, 1000]
<class 'generator'>
(1, 27, 216, 1000)
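The points about next(), lazy evaluation and memory use can be sketched together. The countdown function is a hypothetical example, and sys.getsizeof reports only the size of the container object itself, which is enough to show the difference here.

```python
import sys


def countdown(start):
    """Yield start, start - 1, ..., 1."""
    while start > 0:
        yield start
        start -= 1


gen = countdown(3)
first = next(gen)    # runs the function up to the first yield
second = next(gen)   # resumes after that yield, with its state intact
rest = list(gen)     # consumes whatever is left
print(first, second, rest)   # 3 2 [1]

# A list comprehension stores every element; a generator expression stores none
squares_list = [n * n for n in range(10000)]
squares_gen = (n * n for n in range(10000))
print(sys.getsizeof(squares_list) > sys.getsizeof(squares_gen))   # True

# A generator expression passed directly as a function argument
total = sum(n * n for n in range(10))
print(total)   # 285
```

Once a generator is exhausted, a further call to next() raises StopIteration; a for loop handles that signal automatically.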
Coroutine in Python
- Coroutines are program components that generalize subroutines for non-preemptive multitasking by allowing execution to be suspended and resumed
- Because coroutines can pause and resume their execution context, they're well suited to concurrent processing
- A coroutine is a special type of function that yields control back to the caller without ending its context, which is kept in an idle state until the coroutine is resumed
- In a coroutine, yield can also be used on the right-hand side of an = operator, meaning the coroutine accepts a value at that point; the caller supplies it with send()
Example —
def func():
    print("My first Coroutine")
    while True:
        var = (yield)
        print(var)

coroutine = func()
next(coroutine)
Implementation —
def func():
    print("My first Coroutine")
    while True:
        var = (yield)
        print(var)

coroutine = func()
next(coroutine)
Output —
My first Coroutine
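The coroutine above is only started, never given a value. As a minimal sketch of the next step, the hypothetical running_average coroutine below receives values through send() and yields the average of everything seen so far.

```python
def running_average():
    """Coroutine that receives numbers via send() and yields their running average."""
    total = 0.0
    count = 0
    average = None
    while True:
        value = yield average   # pause here; resume when the caller sends a value
        total += value
        count += 1
        average = total / count


averager = running_average()
next(averager)               # prime the coroutine: run it up to the first yield
print(averager.send(10))     # 10.0
print(averager.send(20))     # 15.0
print(averager.send(30))     # 20.0
averager.close()             # stop the coroutine when it is no longer needed
```

The call to next() is required before the first send(), because a freshly created generator has not yet reached a yield that could receive the value.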