Scientific Programming Lab

Data Science Master @University of Trento - AA 2020/21


Teaching assistant: David Leoni david.leoni@unitn.it website: davidleoni.it

This work is licensed under a Creative Commons Attribution 4.0 License CC-BY


Timetable and lecture rooms

Due to the current situation regarding the Covid-19 pandemic, Practicals will take place ONLINE this year. They will be held on Mondays from 14:30 to 16:30 and on Wednesdays from 11:30 to 12:30.

Practicals will use the Zoom platform (https://zoom.us/). The connection link will be published on the practical page of this site a few minutes before the start of each session.

This first part of the course will tentatively run from Wednesday, September 23rd, 2020 to Monday, November 2nd, 2020.

Moodle

In the Moodle page of the course you can find announcements and videos of the lectures.

News

22 September 2020:

WARNING: Part A of this website is being gradually improved and transferred to en.softpython.org

Keep an eye on the news to see what has been transferred so far. Whenever a page is moved, I will replace the old page with a link.

12 September 2020:

Old news

Slides

See Slides page

Office hours

To schedule a meeting, see here

Labs timetable

For the regular labs timetable please see:

Tutoring

A tutoring service for Scientific Programming will be set up

Please take advantage of it as much as possible so you don’t end up writing random code at the exam!

TBD

Resources

Part A Resources

Part A Theory slides by Andrea Passerini

Allen Downey, Think Python

License: Creative Commons CC BY Non Commercial 3.0, as reported in the original page

Tutorials from Nicola Cassetta

  • Step-by-step tutorials in Italian, good for beginners. They are well done and come with solutions - please try them all.

  • online

Dive into Python 3

License: Creative Commons By Share-alike 3.0, as reported at the bottom of the book website

LeetCode

Website with collections of exercises sorted by difficulty and acceptance rate. As a rule of thumb, try sorting by Acceptance and filtering by Easy difficulty.

leetcode.com

For a selection of exercises from leetcode, see Further resources sections at the ends of

HackerRank

Contains many Python 3 exercises on algorithms and data structures (requires login)

hackerrank.com

Geeks for Geeks

Contains many exercises. It has neither solutions nor explicit asserts, but if you log in and submit solutions, the system will run some tests server-side and give you a response.

In general, for Part A you can filter difficulty by school+basic+easy; if you also need to do Part B, include medium as well.

Example: Filter difficulty by school+basic+easy and topic String

You can select many more topics if you click more>> under Topic Tags.


Material from other courses of mine (in Italian)

Part B Resources

Editors

  • Visual Studio Code: the course official editor.

  • Spyder: Seems like a fine and simple editor

  • PyCharm Community Edition

  • Jupyter Notebook: Nice environment to execute Python commands and display results like graphs. It also lets you include documentation in Markdown format

  • JupyterLab: next and much better version of Jupyter, although as of Sept 2018 it is still in beta

  • PythonTutor, a visual virtual machine (very useful! can also be found in examples inside the book!)

Further readings

  • Rule based design by Lex Wedemeijer, Stef Joosten, Jaap van der Woude: a very readable text on how to represent information using only binary relations with boolean matrices (not a mandatory read, it only gives context and practical applications for some of the material on graphs presented during the course)

Exams

Exams dates: see Moodle

Past exams

Exam modalities

Practice with the lab computers!!

The exam will be in a Linux Ubuntu environment (even when online), so learn how to browse folders there and, if attending in person, also how to type on noisy lab keyboards :-)

Sciprog exams are open book. You will only be given access to this documentation (if attending in person, you can also bring a printed version of the material listed below):

Expectations

This is a data science master, so you must learn to be a proficient programmer - no matter the background you have.

Exercises proposed during labs are an example of what you will get during the exam, BUT there is no way you can learn the required level of programming only doing exercises on this website. Fortunately, since Python is so trendy nowadays there are a zillion good resources to hone your skills - you can find some in Resources

To successfully pass the exam, you should be able to quickly solve exercises proposed during labs with difficulty ranging from ✪ to ✪✪✪ stars. By quickly I mean that in half an hour you should be able to solve a three star exercise ✪✪✪. Typically, an exercise will be divided in two parts: the first is easy ✪✪ to introduce you to the concept, and the second is more difficult ✪✪✪ to see if you really grasped the idea.

Before getting scared, keep in mind I’m most interested in your capability to understand the problem and find your way to the solution. In real life, senior colleagues often give junior programmers functions to implement based on specifications, possibly with tests to make sure the implementation meets the specifications. Also, programmers copy code all of the time. This is why during the exam I give you tests for the functions to implement so you can quickly spot errors, and also let you use the course material (see exam modalities).

Part A expectations: performance does not matter: if you are able to run the required algorithm on your computer and the tests pass, it should be fine. Just be careful when given a 100 MB file: in that case bad code may sometimes lead to very slow execution and/or clog the memory.

In particular, in lab computers the whole system can even hang, so watch out for errors such as:

  • an infinite while loop which keeps adding new elements to lists - whenever possible, prefer for loops

  • scanning a big pandas dataframe with a for ... in loop instead of using pandas native transformations
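To make the first pitfall concrete, here is a minimal sketch in plain Python. The function names `squares_while` and `squares_for` are made up for illustration and are not part of the course material. Both build the same list, but only the `while` version can run forever if the counter update is forgotten.

```python
def squares_while(n):
    """Builds [0, 1, 4, ...] with a while loop (error-prone)."""
    ret = []
    i = 0
    while i < n:
        ret.append(i * i)
        # If this line is forgotten, the loop never ends and
        # ret keeps growing until the memory is exhausted
        i += 1
    return ret


def squares_for(n):
    """Same result with a for loop: range(n) always terminates."""
    ret = []
    for i in range(n):
        ret.append(i * i)
    return ret


print(squares_while(4))
print(squares_for(4))
```

With the `for` version there is no counter to forget, which is why it is the safer default whenever the number of iterations is known in advance.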

Part B expectations: performance does matter (e.g. finding the diagonal of a matrix should take a time linearly proportional to \(n\), not \(n^2\)). Also, in this part we will deal with more complex data structures. Here we generally follow the Do It Yourself method, reimplementing things from scratch. So please, use your brain:

  • if the exercise is about sorting, do not call Python’s .sort() method!!!

  • if the exercise is about data structures, and you are thinking about converting the whole data structure (or part of it) into python lists, first, think about the computational cost of such conversion, and second, do ask the instructor for permission.
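The diagonal example mentioned for Part B can be sketched as follows. This is only an illustration: it assumes the matrix is a square list of lists, and the names `diagonal_quadratic` and `diagonal_linear` are hypothetical. Both functions return the same result, but the first visits all \(n^2\) cells while the second touches only the \(n\) cells on the diagonal.

```python
def diagonal_quadratic(mat):
    """Visits every cell: time proportional to n^2 (too slow for Part B)."""
    n = len(mat)
    ret = []
    for i in range(n):
        for j in range(n):
            if i == j:
                ret.append(mat[i][j])
    return ret


def diagonal_linear(mat):
    """Visits only the cells at (i, i): time proportional to n."""
    ret = []
    for i in range(len(mat)):
        ret.append(mat[i][i])
    return ret


m = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]

print(diagonal_quadratic(m))
print(diagonal_linear(m))
```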

Grading

Taking part in an exam erases *any* grade you had before (except for Midterm B, which of course doesn’t erase Midterm A taken in the same academic year)

Correct implementations: Correct implementations with the required complexity grant you full grade.

Partial implementations: Partial implementations might still give you a few points. If you just can’t solve an exercise, try to solve it at least for some subcase (e.g. an array of fixed size 2), commenting why you did so.
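As a rough idea of what such a partial solution might look like, here is a sketch for a hypothetical exercise (`my_sort` is an invented name, not an actual exam function). It handles only lists of up to two elements and says so explicitly in a comment.

```python
def my_sort(arr):
    """Hypothetical exercise: return a NEW sorted list
    without calling .sort() or sorted().
    """
    # PARTIAL SOLUTION: I could not complete the general algorithm,
    # so I only handle lists with at most 2 elements,
    # where a single comparison is enough.
    if len(arr) <= 1:
        return list(arr)
    if len(arr) == 2:
        a, b = arr
        if a <= b:
            return [a, b]
        return [b, a]
    # General case left unimplemented on purpose
    raise NotImplementedError("TODO: lists longer than 2 elements")


print(my_sort([3, 1]))
```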

When all tests pass you should hopefully get full grade (although tests are never exhaustive!), but if the code is not correct you will still get a percentage. The percentage is of course subjective, and may depend on unfathomable factors such as the quantity of jam I found in the morning croissant that particular day. Jokes aside, the amount you get is usually inversely proportional to the amount of time I have to spend to fix your algorithm.

After exams I publish the code with corrections. If all tests pass and you still don’t get 100% grade, you may come to my office to discuss the grade. If tests don’t pass I’m less available for debating - I don’t much like complaints like ‘my colleague made the same error as me and got more points’ - even worse is complaining without having read the corrections.

Exams FAQ

First and foremost, I’m not the boss here, please refer to the exam rules explained in Andrea Passerini’s slides.

I add here some further questions I sometimes receive - luckily, answers are pretty easy to remember.

I did well in part A/B, can I do only part B/A at the next exam?

No way.

Can I have an additional retake just for me?

No way.

Can I have an additional oral exam to increase the grade?

No way.

I have 7 + \(\sqrt{3}\) INF credits from a Summer School in Applied Calculonics, can I please take only Part B?

I’m not into credits engineering, please ask the administrative office and/or Passerini.

I have another request which does not concern corrections / possibly wrong grading

I’m not the boss, ask Passerini.

I’ve got 26.99 but this is my last exam and I really need 27 to get a good final outcome for my master, could you please raise the grade by just that little 0.01?

Preposterous requests like this will be forwarded to our T-800 assistant, it’s very efficient.

Exams How To

Make sure all exercises at least compile!

Don’t leave duplicated code around!

If I see duplicated code, I don’t know what to grade, I waste time, and you don’t want me angry while grading.

Only implementations of provided function signatures will be evaluated !!

For example, if you are asked to implement:

def f(x):
    raise Exception("TODO implement me")

and you ship this code:

def my_f(x):
    # a super fast, correct and stylish implementation
    ...  # the actual logic would go here

def f(x):
    raise Exception("TODO implement me")

We will assess only the latter function, f(x), and conclude it doesn’t work at all :P !!!!!!!

Helper functions

Still, you are allowed to define any extra helper function you might need. If your f(x) implementation calls some other function you defined like my_f here, it is ok:

# Not called by f, will get ignored:
def my_g(x):
    pass  # bla

# Called by f, will be graded:
def my_f(y, z):
    pass  # bla

def f(x):
    return my_f(x, 5)

How to edit and run

Look in Applications->Programming:

  • Part A: Jupyter: open Terminal and type jupyter notebook

  • Part B: open Visual Studio Code

If for whatever reason tests don’t work in Visual Studio Code, be prepared to run them in the Terminal.

PAY close attention to function comments!

DON’T modify function signatures! Just provide the implementation

DON’T change existing test methods. If you want, you can add tests

DON’T create other files. If you still do it, they won’t be evaluated
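If you decide to add tests, the sketch below shows one possible way. It assumes the provided tests use Python's standard `unittest` module, which this page does not state explicitly. The function `double` and the class `DoubleTest` are hypothetical names used only for illustration. The extra test method sits next to the existing one, which is left untouched.

```python
import unittest


def double(x):
    """Hypothetical function to implement."""
    return x * 2


class DoubleTest(unittest.TestCase):

    # Existing test provided with the exercise: do NOT change it
    def test_positive(self):
        self.assertEqual(double(3), 6)

    # Extra test added by the student to cover another case
    def test_zero(self):
        self.assertEqual(double(0), 0)


# Runs the tests programmatically and prints a report
suite = unittest.defaultTestLoader.loadTestsFromTestCase(DoubleTest)
result = unittest.TextTestRunner(verbosity=2).run(suite)
```

Under the same assumption, tests saved in a file can also be launched from the Terminal through `unittest`'s command line interface.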

Debugging

If you need to print some debugging information, you are allowed to put extra print statements in the function bodies

Even if print statements are allowed, be careful with prints that might break your function! For example, avoid stuff like this:

x = 0
print(1/x)
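A safer habit is to print only values that already exist, without computing anything new inside the print. The following is a minimal sketch with a hypothetical function `mean_or_zero`, not taken from the course exercises.

```python
def mean_or_zero(values):
    """Hypothetical exercise: average of a list of numbers,
    or 0.0 if the list is empty.
    """
    # Risky: this debug print would raise ZeroDivisionError on [],
    # breaking a function that is otherwise correct:
    # print("DEBUG mean =", sum(values) / len(values))

    # Safer: shows existing data only, no extra computation
    print("DEBUG values =", values)

    if len(values) == 0:
        return 0.0
    return sum(values) / len(values)


print(mean_or_zero([2, 4, 6]))
print(mean_or_zero([]))
```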

Acknowledgements

  • I wish to thank Dr. Luca Bianco for the introductory material on Visual Studio Code and Python

  • This site was made with Jupyter using NBSphinx extension and Jupman template