Have you ever wondered how memory is managed in Python?
Or what the reference count is in Python?
In this article, you can expect details about what a reference count is, how it is calculated, and how Python uses it to manage memory.
Without wasting any further time, let’s start point by point…
For the sake of simplicity, the reference count is nothing but the number of references that currently point to a Python object, that is, the number of places where the object is in use.
How is the reference count calculated?
sys.getrefcount() is a built-in function in Python’s sys module. It takes a Python object as input and returns the number of references to that object. The returned value is generally one higher than you might expect, because passing the object to getrefcount() creates a temporary reference of its own.
Here, the input to getrefcount() can be a variable, a literal value, a function, a class, or anything else that is a Python object.
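As a small sketch of that point, the snippet below passes several kinds of objects to sys.getrefcount(). The names greet and Point are hypothetical and exist only for this illustration; the numbers printed will differ between Python versions and systems.

```python
import sys

def greet():
    # hypothetical function, used only as a sample object
    return "hello"

class Point:
    """Hypothetical class, used only as a sample object."""

samples = {
    "list": [1, 2, 3],
    "literal value": 1556778,
    "function": greet,
    "class": Point,
    "module": sys,
}

# Ask for the reference count of each kind of object in turn.
counts = {}
for label, obj in samples.items():
    counts[label] = sys.getrefcount(obj)
    print(label, counts[label])
```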
Let’s take an example…
import sys
print(sys.getrefcount(1556778))
Output:
3
This means the integer value ‘1556778’ is used 3 times.
You might be curious… how does it come to 3 when you have used the value only once?
The reference count is calculated based on two factors…
Let’s dig into some technical detail…
When you run a Python program, it is first compiled into bytecode. The reference count of an object depends on how the object is referenced by that bytecode and by the running interpreter, not on how many times it appears in your high-level program code.
You can also check the bytecode of your program using the dis module. It disassembles the Python bytecode.
Below is the code to get the bytecode of the Python program.
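import dis
import sys
# constants stored on the compiled code object
print(compile("sys.getrefcount(1556778)", '', 'single').co_consts)
# dis.dis() prints the disassembly itself and returns None, so print() also shows "None"
print(dis.dis(compile("sys.getrefcount(1556778)", '', 'single')))
print(sys.getrefcount(1556778))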
Output:
(1556778, None)
  1           0 LOAD_NAME                0 (sys)
              3 LOAD_ATTR                1 (getrefcount)
              6 LOAD_CONST               0 (1556778)
              9 CALL_FUNCTION            1
             12 PRINT_EXPR
             13 LOAD_CONST               1 (None)
             16 RETURN_VALUE
None
3
Here, ‘single’ is the mode passed to compile(). It tells Python to compile the source as a single interactive statement.
There are 3 references here: one from the co_consts tuple on the code object, one on the stack (from the LOAD_CONST instruction), and one for the argument of the sys.getrefcount() function itself.
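The listing above comes from an older interpreter, and opcode names differ between Python versions. As a rough sketch for Python 3, the snippet below compiles the same expression without running it and lists the constants and opcode names found on the code object. The file name "<example>" is an arbitrary placeholder.

```python
import dis

source = "sys.getrefcount(1556778)"
# Compile only; the expression is never executed, so sys is not needed here.
code = compile(source, "<example>", "single")

# The constants stored on the code object; the integer literal lives here.
print(code.co_consts)

# dis.get_instructions() yields one Instruction per bytecode operation.
opnames = [ins.opname for ins in dis.get_instructions(code)]
for name in opnames:
    print(name)
```

Whatever the exact opcodes are on your version, the literal should still appear in co_consts, which is the first of the references described above.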
If the same object is used in the other part of the code, it will be counted in the reference count of the given object.
The interpreter also performs many operations in the background while your program runs. If the object is used there, that use is counted as a reference to the object as well.
The output (reference count) may vary from system to system.
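Because the absolute numbers vary, it is often easier to look at how the count changes. The sketch below assumes CPython and builds the integer at runtime, so that no compiled constant shares the object; the names alias and holder are made up for this example.

```python
import sys

# Build the integer at runtime so that no compiled constant shares it.
value = int("1556778")
base = sys.getrefcount(value)

alias = value              # one more name bound to the same object
holder = [value, value]    # two more references held by a list
with_refs = sys.getrefcount(value)

del alias, holder          # drop the three extra references again
after = sys.getrefcount(value)

print(with_refs - base)    # how many references were added
print(after - base)        # difference once they are gone
```

Comparing differences rather than raw counts keeps the interpreter's own background references out of the picture.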
When you pass the variable as a parameter to the function, the reference count for the variable object is incremented. When the control goes out of the function, the reference count is decremented.
import sys
a = 10
print(sys.getrefcount(a))  # 17
def func(b):
    print(sys.getrefcount(a))  # 19
func(a)
print(sys.getrefcount(a))  # 17
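Note: The reference count is shown as 19 instead of 18 because the call adds two references to the object rather than one: one for the argument being passed to func(), and one for the parameter b bound inside the function. The temporary reference held by sys.getrefcount() is already included in all three numbers.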
Along with the list object, every element in the list has a separate reference count.
When you delete a list or if the lifetime of the list expires, the reference count of each element in the list goes down by one.
import sys
liAbc = ['a', 'b', 'c']
print(sys.getrefcount('a'))  # 14
print(sys.getrefcount('b'))  # 12
print(sys.getrefcount('c'))  # 23
del liAbc
print(sys.getrefcount('a'))  # 13
print(sys.getrefcount('b'))  # 11
print(sys.getrefcount('c'))  # 22
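Short strings such as 'a' are shared with the interpreter, and on recent CPython versions their counts may not change at all. As a hedged variant of the same experiment, the sketch below uses fresh objects instead; the Item class is hypothetical and exists only so that each instance is a brand-new object.

```python
import sys

class Item:
    """Hypothetical placeholder class; each instance is a brand-new object."""

first, second = Item(), Item()
before = (sys.getrefcount(first), sys.getrefcount(second))

items = [first, second]    # the list now holds one reference to each element
inside = (sys.getrefcount(first), sys.getrefcount(second))

del items                  # deleting the list releases those references
after = (sys.getrefcount(first), sys.getrefcount(second))

print(before, inside, after)
```

On CPython, each count should go up by one while the list exists and return to its starting value once the list is deleted.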
For more details about the list, you can read Python list vs tuple.
Python uses dynamic memory allocation. When you create a variable, you don’t need to allocate memory for it explicitly. When the object is no longer used in the program, its memory is released automatically.
There are two questions that arise…
And, here comes the use of reference count.
How does Python count references used for Memory Management?
Python keeps a reference count for each object. Each time a new reference to the object is created, the count is incremented.
When a reference goes out of scope or is deleted, the count is decremented.
When the reference count reaches zero, the object is no longer in use, and the memory assigned to it is freed.
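The moment of reclamation can be observed with weakref.finalize, which registers a callback that runs when an object is reclaimed. This is only a sketch: the Resource class and the events list are made up for the example, and the immediate cleanup shown is CPython behaviour.

```python
import weakref

class Resource:
    """Hypothetical class used only for this demonstration."""

events = []
res = Resource()
# Register a callback that runs when the object is reclaimed.
weakref.finalize(res, events.append, "freed")

other = res     # a second reference to the same object
del res         # count drops, but 'other' still keeps the object alive
events.append("after first del")
del other       # last reference gone: CPython reclaims the object here
events.append("after second del")

print(events)   # ['after first del', 'freed', 'after second del']
```

The callback fires only after the second del, that is, when the last reference disappears.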
The integer is one of the numeric data types in Python.
When you create an integer object, the value of the object is saved in memory to use in the program. The reference count is set.
When you assign the same integer value to another variable, the reference count increases.
Python also saves memory by storing the value once and letting every variable that holds that value refer to the same object. (CPython guarantees this sharing only for small integers, currently -5 through 256; for larger values it depends on how the code is compiled.)
And we know that an integer is an immutable datatype in Python, so we cannot change the value of an integer object. A new value is a different object, stored at a different memory location with its own reference count.
import sys
print(sys.getrefcount(55))  # 4
var = 55
print(sys.getrefcount(55))  # 5
var = var + 1
print(sys.getrefcount(55))  # 4
In the above program, the value of the variable var is incremented (you can change it to any other value or delete the variable). As an integer is immutable, Python cannot update the value in place. Instead, var is rebound to a different object holding the new value, and the reference count of the previous value is decremented by one.
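The sharing and rebinding described above can also be checked without looking at counts, using the is operator and id(). This is a minimal sketch that relies on CPython's caching of small integers; the variable names are arbitrary.

```python
a = 55
b = 55
same_object = a is b       # both names refer to one shared int object

old_id = id(a)
a = a + 1                  # rebinds a to the object for 56; 55 is untouched
moved = id(a) != old_id

print(same_object, moved, b)   # True True 55
```

Note that b still refers to 55: incrementing a changed which object the name points to, not the integer object itself.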
Now, what if you use a smaller integer value?
import sys
print(sys.getrefcount(1))  # 97
print(sys.getrefcount(2))  # 76
print(sys.getrefcount(3))  # 30
This means integer value 1 is used 97 times, 2 is used 76, and 3 is used 30 times.
The Python interpreter does a lot of work in the background, and these small values are used throughout it. The output may vary from system to system.
To find the pattern for the number of times a Python object is used, we can plot the graph for a range of input objects.
import sys
import matplotlib.pyplot as plt
# calculate the values for the x and y axes
x = range(500)
y = [sys.getrefcount(i) for i in x]
fig, ax = plt.subplots()
ax.plot(x, y, '.')
# set label for x axis
ax.set_xlabel("number")
# set label for y axis
ax.set_ylabel("sys.getrefcount(number)")
# plot the graph
plt.show()
From the graph, it is clear that smaller numbers have higher reference counts. The first few values have a reference count of more than 3000. This means smaller numbers are used widely by Python in the background.
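If matplotlib is not installed, the same pattern can be inspected as plain text. The sketch below lists the five most-referenced integers below 500. The actual numbers depend on the interpreter; on recent CPython versions small integers may all report the same very large value, because they are treated as immortal objects.

```python
import sys

# Reference count for every integer from 0 to 499.
counts = {n: sys.getrefcount(n) for n in range(500)}

# Sort the numbers by reference count, highest first, and keep the top five.
top_five = sorted(counts, key=counts.get, reverse=True)[:5]
for n in top_five:
    print(n, counts[n])
```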
import sys
import matplotlib.pyplot as plt
# string with all lowercase letters
strLet = "abcdefghijklmnopqrstuvwxyz"
refs = [sys.getrefcount(l) for l in strLet]
y_pos = range(len(strLet))
plt.bar(y_pos, refs, align='center')
# use the letters themselves as tick labels
plt.xticks(y_pos, list(strLet))
# set label for x axis
plt.xlabel("letter")
# set label for y axis
plt.ylabel('sys.getrefcount(strLet)')
# plot the graph
plt.show()
Programmers are fond of the letter ‘x’ and use it for many variable names. The graph bears this out: ‘x’ as an object is referenced more than 1100 times in Python.
As Python is a case-sensitive language, you will get different reference counts (and so a different graph) for lowercase and uppercase letters.
How can we exclude “Python” itself?
import sys
for w in ["python", "version", "error", "var", "reference"]:
    print(w, sys.getrefcount(w))
Output:
python 6
version 12
error 47
var 9
reference 6
You can try some other keywords as well.
Keep in mind that every count shown in this article includes the temporary reference held by sys.getrefcount() itself.
That’s all!
Understanding reference count is very important for memory management. If you find this article fruitful, kindly share it with your friends.
I have tried to answer several daunting questions about the Python getrefcount() function, reference count, and memory management. If you have any doubts, feel free to write in the comment section.