Scientific Programming Lab¶

Data Science Master @University of Trento - AA 2021/22

Teaching assistant: David Leoni david.leoni@unitn.it website: davidleoni.it

Timetable and lecture rooms¶

DS Labs:

• Thursdays 15.30-17.30 room B106

• Fridays 17.30-19.30 room A207

This year practicals will take place in person. This first part of the course will run from Thursday, September 23rd

Complete timetable:

• Part B: Erik Dassi’s course site TBD

Moodle: In the Moodle page of the course you can find announcements and your repl links.

News¶

The lab on Friday 24 September is cancelled. Regular meetings will resume on Thursday 30 September.

6 September 2021: Published exam solutions

12 July 2021: Published exam solutions

11 June 2021: Published exam solutions

12 September 2020:

Old news

Office hours¶

To schedule a meeting, see here

TBD

References¶

Editors¶

• Visual Studio Code: the course official editor.

• Spyder: Seems like a fine and simple editor

• PyCharm Community Edition

• Jupyter Notebook: a nice environment to execute Python commands and display results such as graphs. It also lets you include documentation in Markdown format

• JupyterLab: the next and much better version of Jupyter, although as of Sept 2018 it is still in beta

• PythonTutor, a visual virtual machine (very useful! can also be found in examples inside the book!)

• Rule based design by Lex Wedemeijer, Stef Joosten, Jaap van der Woude: a very readable text on how to represent information using only binary relations with boolean matrices (not a mandatory read, it only gives context and practical applications for some of the material on graphs presented during the course)

Exams¶

Exam dates: see Moodle (TBD)

Exam modalities¶

Practice on the lab computers!!

The exam will be in a Linux Ubuntu environment (even when online) - so learn how to browse folders there and, if in person, also how to type on noisy lab keyboards :-)

Sciprog exams are open book. You will only be given access to this documentation (if in person you can also bring a printed version of the material listed below):

Expectations¶

This is a data science master, so you must learn to be a proficient programmer - no matter the background you have.

Exercises proposed during labs are an example of what you will get during the exam, BUT there is no way you can reach the required level of programming only by doing exercises on this website or softpython. Fortunately, since Python is so trendy nowadays there are a zillion good resources to hone your skills - you can find some in References

To successfully pass the exam, you should be able to quickly solve exercises proposed during labs with difficulty ranging from ✪ to ✪✪✪ stars. By quickly I mean that in half an hour you should be able to solve a three star exercise ✪✪✪.

Before getting scared, keep in mind I’m most interested in your capability to understand the problem and find your way to the solution. In real life, senior colleagues often give junior programmers functions to implement based on specifications, possibly with tests to check that the implementation meets those specifications. Also, programmers copy code all of the time. This is why during the exam I give you tests for the functions to implement so you can quickly spot errors, and also let you use the course material (see exam modalities).

Part A expectations: performance does not matter: if you are able to run the required algorithm on your computer and the tests pass, it should be fine. Just be careful when given a 100 MB file: in that case bad code may sometimes lead to very slow execution and/or clog the memory.

In particular, in lab computers the whole system can even hang, so watch out for errors such as:

• an infinite while loop which keeps adding new elements to lists - whenever possible, prefer for loops

• scanning a big pandas dataframe with a for ... in loop instead of pandas native transformations
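A minimal sketch of the first point, assuming a hypothetical task of collecting the squares of the first n integers (the function names are invented for illustration). The while version terminates only because the counter is updated by hand, whereas the for version is bounded by construction:

```python
def squares_while(n):
    """Collect squares with a while loop: correct only because i is incremented."""
    result = []
    i = 0
    while i < n:
        result.append(i * i)
        i += 1  # forgetting this line makes the list grow until memory runs out
    return result


def squares_for(n):
    """Same result with a for loop: range(n) guarantees termination."""
    result = []
    for i in range(n):
        result.append(i * i)
    return result


print(squares_while(5))  # [0, 1, 4, 9, 16]
print(squares_for(5))    # [0, 1, 4, 9, 16]
```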

Part B expectations: performance does matter (e.g. finding the diagonal of a matrix should take a time linearly proportional to $$n$$, not $$n^2$$). Also, in this part we will deal with more complex data structures. Here we generally follow the Do It Yourself method, reimplementing things from scratch. So please, use your brain:

• if the exercise is about sorting, do not call Python's .sort() method !!!

• if the exercise is about data structures, and you are thinking about converting the whole data structure (or part of it) into python lists, first, think about the computational cost of such conversion, and second, do ask the instructor for permission.
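To make the diagonal example concrete, here is a small sketch assuming a square matrix stored as a list of lists (the function names are hypothetical, not taken from the course material). Both functions return the same result, but the first one visits all $$n^2$$ cells while the second visits only the $$n$$ cells on the diagonal:

```python
def diagonal_quadratic(mat):
    """Visits every cell of an n x n matrix: about n^2 steps."""
    result = []
    for i in range(len(mat)):
        for j in range(len(mat)):
            if i == j:
                result.append(mat[i][j])
    return result


def diagonal_linear(mat):
    """Visits only the n diagonal cells: about n steps."""
    return [mat[i][i] for i in range(len(mat))]


m = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]
print(diagonal_quadratic(m))  # [1, 5, 9]
print(diagonal_linear(m))     # [1, 5, 9]
```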

Taking part in an exam erases *any* grade you had before (except for Midterm B which of course doesn’t erase Midterm A taken in the same academic year)

Correct implementations: Correct implementations with the required complexity earn you the full grade.

Partial implementations: Partial implementations might still give you a few points. If you just can’t solve an exercise, try to solve it at least for some subcase (e.g. an array of fixed size 2), adding a comment to explain why you did so.

When all tests pass you should hopefully get the full grade (although tests are never exhaustive!), but if the code is not correct you will still get a percentage. Percentage of course is subjective, and may depend on unfathomable factors such as the quantity of jam I found in the morning croissant that particular day. Jokes aside, the percentage you get is usually inversely proportional to the amount of time I spend fixing your algorithm.
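As an illustration only, a partial solution could look like the sketch below. The exercise (sorting a list in place without calling .sort()) and the name my_sort are hypothetical; the point is the comment that states which subcase is covered and the explicit error for everything else:

```python
def my_sort(lst):
    """Hypothetical exercise: sort lst in place without calling .sort()."""
    # PARTIAL SOLUTION: I could not complete the general algorithm,
    # so I only handle lists with at most 2 elements.
    if len(lst) <= 1:
        return  # nothing to do
    if len(lst) == 2:
        if lst[0] > lst[1]:
            lst[0], lst[1] = lst[1], lst[0]
        return
    raise NotImplementedError("only lists of length <= 2 are supported")


pair = [5, 3]
my_sort(pair)
print(pair)  # [3, 5]
```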

After exams I publish the code with corrections. If all tests pass and you still don’t get 100% grade, you may come to my office to question the grade. If tests don’t pass I’m less available for debating - I don’t much like complaints such as ‘my colleague made the same error as me and got more points’ - even worse is complaining without having read the corrections.

Exams FAQ¶

First and foremost, I’m not the boss here, please refer to the exam rules explained in Andrea Passerini's slides.

I add here some further questions I sometimes receive - luckily, answers are pretty easy to remember.

I did well in part A/B, can I do only part B/A at the next exam?

No way.

Can I have an additional retake just for me?

No way.

Can I have an additional oral exam to increase the grade?

No way.

I have $$\pi + \sqrt{7}$$ INF credits from a Summer School in Applied Calculonics, can I please take only Part B?

I’m not into credits engineering, please ask the administrative office and/or Passerini.

I have another request which does not concern corrections / possibly wrong grading

I’m not the boss, ask Passerini.

I’ve got 26.99 but this is my last exam and I really need 27 so I can get a good final master's grade, could you please raise the grade by just that little 0.01?

Preposterous requests like this will be forwarded to our T-800 assistant, it’s very efficient

Exams How To¶

Make sure all exercises at least compile!

Don’t leave duplicated code lying around!

If I see duplicated code, I don’t know what to grade, I waste time, and you don’t want me angry while grading.

Only implementations of provided function signatures will be evaluated !!

For example, if you are asked to implement:

def f(x):
    raise Exception("TODO implement me")


and you ship this code:

def my_f(x):
    # a super fast, correct and stylish implementation
    ...

def f(x):
    raise Exception("TODO implement me")


We will assess only the latter one f(x), and conclude it doesn’t work at all :P !!!!!!!

Helper functions

Still, you are allowed to define any extra helper function you might need. If your f(x) implementation calls some other function you defined like my_f here, it is ok:

# Not called by f, will get ignored:
def my_g(x):
    # bla
    pass

# Called by f, will be graded:
def my_f(y, z):
    # bla
    pass

def f(x):
    return my_f(x, 5)


How to edit and run¶

Look in Applications->Programming:

• Part A: Jupyter: open Terminal and type jupyter notebook

• Part B: open Visual Studio Code

If for whatever reason tests don’t work in Visual Studio Code, be prepared to run them in the Terminal.

PAY close attention to function comments!

DON’T modify function signatures! Just provide the implementation

DON’T change existing test methods. If you want, you can add tests

DON’T create other files. If you still do it, they won’t be evaluated
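On adding your own tests: the sketch below assumes the tests are written with Python's standard unittest module, which this page does not state, and uses a hypothetical function f and test class FTest. The provided test method stays untouched, and the extra test goes in as a new method next to it:

```python
import unittest


def f(x):
    """Hypothetical function under test: doubles its argument."""
    return 2 * x


class FTest(unittest.TestCase):

    def test_provided(self):
        # stands in for a test supplied with the exam: leave it untouched
        self.assertEqual(f(2), 4)

    def test_zero(self):
        # an extra test you might add yourself
        self.assertEqual(f(0), 0)


# Run the tests explicitly and keep the outcome in a variable
suite = unittest.defaultTestLoader.loadTestsFromTestCase(FTest)
result = unittest.TextTestRunner().run(suite)
```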

Debugging¶

If you need to print some debugging information, you are allowed to put extra print statements in the function bodies

Even if print statements are allowed, be careful with prints that might break your function! For example, avoid stuff like this:

x = 0
print(1/x)
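A slightly larger sketch of the same problem, with hypothetical function names: the logic of total_broken is correct, but its debugging print divides by the list length and so raises an exception on an empty list before the function can return. The safe variant prints only values that are always defined:

```python
def total_broken(numbers):
    """Correct logic, but the debug print crashes on an empty list."""
    result = sum(numbers)
    print("DEBUG mean:", result / len(numbers))  # ZeroDivisionError when numbers == []
    return result


def total_safe(numbers):
    """Same logic; the debug print only shows values that are always defined."""
    result = sum(numbers)
    print("DEBUG result:", result, "n:", len(numbers))
    return result


print(total_safe([]))  # 0
try:
    total_broken([])
except ZeroDivisionError as e:
    print("the debug print broke the function:", e)
```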


Acknowledgements¶

This website and related courses were funded mainly by the Department of Information Engineering and Computer Science (DISI), University of Trento, and also by the Mathematics and CIBIO departments.

I also wish to thank Dr. Luca Bianco for the introductory material on Visual Studio Code and Python, and Dr. Alberto Montresor for having introduced me to the first labs and slides on graphs.

All the material in this website is distributed under the CC-BY 4.0 International Attribution license creativecommons.org/licenses/by/4.0/deed.en

Basically, you can freely redistribute and modify the content, just remember to cite University of Trento and the present author.

Technical notes: all website pages are easily modifiable Jupyter notebooks that were converted to web pages using NBSphinx with the Jupman template. Text sources are on GitHub at github.com/DavidLeoni/sciprog-ds