Scientific Programming Lab
Data Science Master @University of Trento - AA 2021/22
Teaching assistant: David Leoni david.leoni@unitn.it website: davidleoni.it
This work is licensed under a Creative Commons Attribution 4.0 License CC-BY
Timetable and lecture rooms
DS Labs:
Thursdays 15.30-17.30 room B106
Fridays 17.30-19.30 room A207
This year practicals will take place in person. This first part of the course will run from Thursday, September 23rd
Tutoring: by Gabriele Masina (gabriele.masina (guess what) studenti.unitn.it)
starting from Monday 15 November until Wednesday 15 December (included):
Mondays: 9:30-11:30 A202
Wednesdays: 9:30-11:30 A202
Complete timetable:
Part A: Andrea Passerini’s course site
Part B: Erik Dassi’s course material is on Moodle
Moodle: In the Moodle page of the course you can find announcements and your repl links.
Lab slides
News
Wed 31, Aug 2022: Published exam solutions
Tue 12, Jul 2022: Published exam solutions
Office hours
To schedule a meeting, see here
References
Part A References
Part A Theory slides by Andrea Passerini
Part B References page
Editors
Visual Studio Code: the course official editor.
Spyder: Seems like a fine and simple editor
Jupyter Notebook: Nice environment to execute Python commands and display results like graphs. It also lets you include documentation in Markdown format
JupyterLab: next and much better version of Jupyter, although as of Sept 2018 it is still in beta
PythonTutor, a visual virtual machine (very useful! can also be found in examples inside the book!)
Exams
Past exams
Exam modalities
Practice with the lab computers!!
The exam will be in a Linux Ubuntu environment (even when online), so learn how to browse folders there and, if the exam is in person, also how to type on noisy lab keyboards :-)
Exams are open book: You will only be given online access to this documentation (you can’t bring printed notes):
Andrea Passerini slides and Erik Dassi slides
In particular, Unittest docs
If you need to look up some Python function, start learning today how to search the documentation on the Python website.
Part A: Think Python book
Part B: Problem Solving with Algorithms and Data Structures using Python book
Expectations
This is a data science master's degree, so you must learn to be a proficient programmer, no matter what background you have.
Exercises proposed during labs are an example of what you will get during the exam, BUT there is no way you can learn the required level of programming by only doing exercises on this website or softpython. Fortunately, since Python is so trendy nowadays, there are a zillion good resources to hone your skills - you can find some in References
To successfully pass the exam, you should be able to quickly solve exercises proposed during labs with difficulty ranging from ✪ to ✪✪✪ stars. By quickly I mean that in half an hour you should be able to solve a three star exercise ✪✪✪.
Before getting scared, keep in mind I’m most interested in your capability to understand the problem and find your way to the solution. In real life, senior colleagues often give junior programmers functions to implement based on specifications, possibly with tests to make sure the implementation meets those specifications. Also, programmers copy code all of the time. This is why during the exam I give you tests for the functions to implement so you can quickly spot errors, and also let you use the course material (see exam modalities).
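As a rough idea of what "a function to implement plus tests" can look like, here is a minimal sketch using Python's standard unittest module. The function double and the class DoubleTest are made-up names for illustration, not actual exam material.

```python
import unittest


def double(x):
    """Return x multiplied by 2 (stands in for the specification you are given)."""
    return x * 2


class DoubleTest(unittest.TestCase):
    """Hypothetical tests of the kind shipped together with a function to implement."""

    def test_zero(self):
        self.assertEqual(double(0), 0)

    def test_positive(self):
        self.assertEqual(double(3), 6)


# Load and run the tests programmatically, so this snippet works both
# as a script and inside a notebook cell without exiting the interpreter.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(DoubleTest)
result = unittest.TextTestRunner().run(suite)
```

If a test fails, the report names the failing method and shows the expected and actual values, which is usually the fastest way to spot where an implementation departs from the specification.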
Part A expectations: performance does not matter: if you are able to run the required algorithm on your computer and the tests pass, it should be fine. Just be careful when given a 100 MB file: in that case bad code may sometimes lead to very slow execution and/or clog the memory.
In particular, on lab computers the whole system can even hang, so watch out for errors such as:
an infinite while which keeps adding new elements to lists - whenever possible, prefer for loops
scanning a big pandas dataframe using a for in instead of pandas native transformations
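The first pitfall can be sketched in plain Python as follows. The function name with_doubles is hypothetical; the risky while loop is shown only in a comment so that nothing here actually hangs.

```python
def with_doubles(values):
    """Return a NEW list holding the original values followed by their doubles."""
    result = list(values)

    # Risky pattern (do NOT run it): the list grows while we walk it,
    # so the index never reaches the end and memory fills up:
    #
    #     i = 0
    #     while i < len(result):
    #         result.append(result[i] * 2)
    #         i += 1
    #
    # Safer: a for loop over the ORIGINAL input runs exactly len(values) times.
    for v in values:
        result.append(v * 2)
    return result


print(with_doubles([1, 2, 3]))  # prints [1, 2, 3, 2, 4, 6]
```

The for loop has a fixed number of iterations decided by the input, so a mistake in the body cannot turn into unbounded memory growth.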
Part B expectations: performance does matter (e.g. finding the diagonal of a matrix should take time linearly proportional to \(n\), not \(n^2\)). Also, in this part we will deal with more complex data structures. Here we generally follow the Do It Yourself method, reimplementing things from scratch. So please, use your brain:
if the exercise is about sorting, do not call the Python .sort() method!!!
if the exercise is about data structures, and you are thinking about converting the whole data structure (or part of it) into Python lists, first, think about the computational cost of such conversion, and second, do ask the instructor for permission.
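To make the diagonal example concrete, here is a minimal sketch, assuming the matrix is a square list of lists. Both function names are invented for illustration.

```python
def diagonal_slow(mat):
    """Visit every cell of an n x n matrix: about n*n steps."""
    ret = []
    for i in range(len(mat)):
        for j in range(len(mat)):
            if i == j:              # only n of the n*n checks succeed
                ret.append(mat[i][j])
    return ret


def diagonal_fast(mat):
    """Visit only the n diagonal cells: about n steps."""
    return [mat[i][i] for i in range(len(mat))]


m = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]
print(diagonal_slow(m))  # prints [1, 5, 9]
print(diagonal_fast(m))  # prints [1, 5, 9]
```

Both versions return the same result, but only the second one has the linear cost expected in Part B.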
Grading
Taking part in an exam erases *any* grade you had before (except for Midterm B which of course doesn’t erase Midterm A taken in the same academic year)
Correct implementations: Correct implementations with the required complexity grant you the full grade.
Partial implementations: Partial implementations might still give you a few points. If you just can’t solve an exercise, try to solve it at least for some subcase (e.g. an array of fixed size 2), adding a comment explaining why you did so.
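A partial solution of this kind might look like the sketch below. The exercise itself (sort_in_place) is hypothetical and only illustrates how to state clearly which subcase is covered.

```python
def sort_in_place(lst):
    """Hypothetical exam exercise: sort lst in place without calling .sort()

    PARTIAL SOLUTION: only lists with at most 2 elements are handled,
    because I could not generalise the swap logic to longer lists in time.
    """
    if len(lst) > 2:
        # Fail loudly instead of silently returning a wrong result
        raise NotImplementedError("only lists of length <= 2 are supported")
    if len(lst) == 2 and lst[0] > lst[1]:
        lst[0], lst[1] = lst[1], lst[0]   # swap the two elements


pair = [7, 3]
sort_in_place(pair)
print(pair)  # prints [3, 7]
```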
When all tests pass you should hopefully get the full grade (although tests are never exhaustive!), but if the code is not correct you will still get a percentage. The percentage of course is subjective, and may depend on unfathomable factors such as the quantity of jam I found in the morning croissant that particular day. Jokes aside, the percentage you get is usually inversely proportional to the amount of time I spend fixing your algorithm.
After exams I publish the code with corrections. If all tests pass and you still don’t get a 100% grade, you may come to my office to question the grade. If tests don’t pass I’m less available for debate - I don’t much like complaints like ‘my colleague made the same mistake as me and got more points’ - even worse is complaining without having read the corrections.
Exams FAQ
First and foremost, I’m not the boss here, please refer to the exam rules explained in Andrea Passerini's slides.
I add here some further questions I sometimes receive - luckily, answers are pretty easy to remember.
Can I have an additional retake just for me?
No way.
Can I have an additional oral exam to increase the grade?
No way.
I have \(\pi + \sqrt{7}\) INF credits from a Summer School in Applied Calculonics, can I please take only Part B?
I’m not into credits engineering, please ask the administrative office and/or Passerini.
I have another request which does not concern corrections / possibly wrong grading
I’m not the boss, ask Passerini.
I’ve got 26.99 but this is my last exam and I really need 27 so I can get a good final master's grade, could you please raise the grade by just that little 0.01?
Preposterous requests like this will be forwarded to our T-800 assistant, it’s very efficient.
Exams How To
Make sure all exercises at least compile!
Don’t leave duplicated code around!
If I see duplicated code, I don’t know what to grade, I waste time, and you don’t want me angry while grading.
Only implementations of provided function signatures will be evaluated !!
For example, if you are asked to implement:
def f(x):
    raise Exception("TODO implement me")
and you ship this code:
def my_f(x):
    # a super fast, correct and stylish implementation
    ...  # (implementation omitted)
def f(x):
    raise Exception("TODO implement me")
We will assess only the latter one, f(x), and conclude it doesn’t work at all :P !!!!!!!
Helper functions
Still, you are allowed to define any extra helper function you might need. If your f(x) implementation calls some other function you defined, like my_f here, it is ok:
# Not called by f, will get ignored:
def my_g(x):
    # bla
    pass
# Called by f, will be graded:
def my_f(y, z):
    # bla
    pass
def f(x):
    my_f(x, 5)
How to edit and run
Look in Applications->Programming:
Part A: Jupyter: open Terminal and type jupyter notebook
Part B: open Visual Studio Code
If for whatever reason tests don’t work in Visual Studio Code, be prepared to run them in the Terminal.
PAY close attention to function comments!
DON’T modify function signatures! Just provide the implementation
DON’T change existing test methods. If you want, you can add tests
DON’T create other files. If you still do it, they won’t be evaluated
Debugging
If you need to print some debugging information, you are allowed to put extra print statements in the function bodies
Even if print statements are allowed, be careful with prints that might break your function! For example, avoid stuff like this:
x = 0
print(1/x)
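One way to keep such a debug print harmless is to guard the risky expression before printing it. This is only a sketch: the helper name safe_ratio_debug is made up for illustration.

```python
def safe_ratio_debug(a, b):
    """Hypothetical helper: build a debug message about a / b without ever raising."""
    if b == 0:
        # Describe the problem instead of evaluating a / b
        return f"DEBUG a={a} b={b} ratio=undefined (b is zero)"
    return f"DEBUG a={a} b={b} ratio={a / b}"


x = 0
print(safe_ratio_debug(1, x))  # no ZeroDivisionError is raised here
```

Printing the raw operands first, and computing derived values only when it is safe to do so, means the debug output can never be the reason a test fails.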
Acknowledgements
This website and related courses were funded mainly by the Department of Information Engineering and Computer Science (DISI), University of Trento, and also by the Mathematics and CIBIO departments.
I also wish to thank Dr. Luca Bianco for the introductory material on Visual Studio Code and Python, and Dr. Alberto Montresor for having introduced me to first labs and slides on graphs.
All the material in this website is distributed with license CC-BY 4.0 International Attribution creativecommons.org/licenses/by/4.0/deed.en. Basically, you can freely redistribute and modify the content, just remember to cite University of Trento and the present author.
Technical notes: all website pages are easily modifiable Jupyter notebooks that were converted to web pages with NBSphinx, using the Jupman template. Text sources are on Github: github.com/DavidLeoni/sciprog-ds