2025 Mar 6th – UQ PUG 16

Welcome to UQ Python User Group! Check out our general info for how we work and what we do. Below you’ll find the details of this month’s gathering.

Overview

Welcome to the second PUG for the year! This month we’ll go through the most popular Python packages (by download count) and explore their most useful features.

Structure

We start today by adding our names to the table below
Add your questions to this page
This month’s presentation
Finally, we spend the rest of the session answering the questions you’ve brought!

Teams

We have moved from a mailing list to a Teams channel! Opt in to receive PUG updates.

How to use this document

This is a Jupyter Notebook, a document format where everything is separated into cells. Each cell contains either Markdown or Python.

The Markdown cells allow you to write formatted text
The Python cells allow you to write AND run Python

A few tips

Double click on a cell to edit it
CTRL + ENTER to run a cell
Press + Code or + Markdown in the top menu to create new cells. Alternatively, press the buttons between cells

Training Resources

We offer Python training sessions and resources. You can find our introductory guide here.

Introduce yourself

What’s your name?	Where are you from?	Why are you here?
Cameron	UQ Library & SMP	To share and learn
Juddi	School of Communication and Arts	-
Duncan	SPPS	To learn more about Python
Elizabeth	-	-
John	-	-
Titus	-	-
Krystie	-	-

Questions

If you have any Python questions you’d like to explore with the group, please put your question and name in the sections below.

If you think you can help, feel free to contribute to the answers section!

Automation packages | Duncan

Do you have any ideas about which popular packages are useful for automation in Windows?

# To interact with the filesystem
import os

# For copying/renaming/more complex operations
import shutil

# To interact with spreadsheets
import pandas as pd

# Folder full of spreadsheets
path = r"..\Demonstrations\top_modules\spreadsheets"

# Returns a list of files at the specified folder
files = os.listdir(path)

# Looping will automate for you
for file in files:
    full_path = os.path.join(path, file)

    # Read in any file (using a 'with' means the file automatically closes)
    with open(full_path) as f:
        contents = f.read()

    # Read in dataframes (e.g. .csv, .xlsx, see pd.read_... for the possibilities)
    df = pd.read_csv(full_path)


print(f"Using pandas' read_csv() function: \n\n{df}")
print(f"\nUsing base python's open() and .read() functions: \n\n{contents[0:200]}\n...")

Using pandas' read_csv() function: 

       download_count        project
0          1346601849          boto3
1           670262655        urllib3
2           609967137       botocore
3           576250793       requests
4           563169502        certifi
...               ...            ...
14995           25173        chilkat
14996           25169       aioradio
14997           25165        binilla
14998           25160   testarchiver
14999           25159  starlette-wtf

[15000 rows x 2 columns]

Using base python's open() and .read() functions: 

download_count,project
1346601849,boto3
670262655,urllib3
609967137,botocore
576250793,requests
563169502,certifi
533963799,charset-normalizer
515259652,setuptools
512857052,idna
492528430,grpcio-stat
...
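
The code above imports shutil but never uses it, so here is a minimal sketch of the copying/renaming side of automation. The function name back_up_csvs, the "backup_" prefix and the demo file names are all made up for illustration, and everything runs inside a temporary folder so no real files are touched.

```python
import shutil
import tempfile
from pathlib import Path


def back_up_csvs(source: Path, backup: Path) -> list[str]:
    """Copy every .csv in `source` into `backup`, prefixing names with 'backup_'."""
    backup.mkdir(parents=True, exist_ok=True)
    copied = []
    for file in sorted(source.glob("*.csv")):
        target = backup / f"backup_{file.name}"
        shutil.copy2(file, target)  # copy2 also tries to preserve timestamps
        copied.append(target.name)
    return copied


# Demonstrate in a throwaway folder (hypothetical file names)
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "spreadsheets"
    source.mkdir()
    (source / "a.csv").write_text("download_count,project\n1,demo\n")
    (source / "b.csv").write_text("download_count,project\n2,demo2\n")
    (source / "notes.txt").write_text("not a spreadsheet")

    copied = back_up_csvs(source, Path(tmp) / "backup")
    print(copied)
```

To try it on real data, pass your own folders instead of the temporary ones. pathlib is used here rather than os.path because it handles Windows and non-Windows path separators for you.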

Answers

See code above
Answer 2
…

Matching company names | Elizabeth

How do I match company names against a list of data?

# Before you start trying to use Python, try to write down by hand what the 
# algorithm would be.


sentence = "This is a long sentence with lots of words like Google, Rio and Tinto."
print(sentence)

# Single string, single keyword:
keyword = "Google"
matched = keyword in sentence

print()
print("Single keyword")
print("Is", keyword, "in my sentence?", matched)

# Single string, multiple keywords
keywords = ["Google", "Amazon"]

print()
print("Multiple keywords")

for company in keywords:
    matched = company in sentence
    print("Is", company, "in my sentence?", matched)


# Single string, multiple companies with multiple keywords (note the nested keywords for RioTinto)
keywords = [["Google"], ["Amazon"], ["Rio-Tinto", "Rio", "Tinto", "RioTinto", "Rio Tinto"]]

print()
print("Multiple companies with multiple keywords")
for company in keywords:
    for variant in company:  
        matched = variant in sentence

        if matched:
            break
    
    print("Is", company[0], "in my sentence?", matched)

This is a long sentence with lots of words like Google, Rio and Tinto.

Single keyword
Is Google in my sentence? True

Multiple keywords
Is Google in my sentence? True
Is Amazon in my sentence? False

Multiple companies with multiple keywords
Is Google in my sentence? True
Is Amazon in my sentence? False
Is Rio-Tinto in my sentence? True
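
One possible next step is to wrap the nested loop in a function and ignore capitalisation. This is only a sketch: the function name find_companies, the dictionary layout and the example sentence are assumptions made for illustration, not part of the session's code.

```python
def find_companies(text, companies):
    """Return the company names whose variants appear in `text` (case-insensitive)."""
    lowered = text.lower()
    found = []
    for name, variants in companies.items():
        # any() stops at the first variant that matches, like the `break` above
        if any(variant.lower() in lowered for variant in variants):
            found.append(name)
    return found


# Hypothetical lookup: company name -> spellings to search for
companies = {
    "Google": ["Google"],
    "Amazon": ["Amazon"],
    "Rio Tinto": ["Rio-Tinto", "RioTinto", "Rio Tinto"],
}

sentence = "We compared GOOGLE with rio tinto last year."
print(find_companies(sentence, companies))
```

Keep in mind that `in` matches substrings, so a short variant such as "Rio" would also match inside a longer word like "Mario". That is why this sketch only lists the longer spellings.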

# Using string data
import pandas as pd

df = pd.read_csv("../Demonstrations/top_modules/spreadsheets/top-pypi-packages.csv")

# The column "project" is a string column

# Let's make a list of keywords.
keywords = ["boto", "py"]

# To check which entries match which keywords
for keyword in keywords:
    matches = df["project"].str.contains(keyword)
    subset_with_keyword = df[matches == True]

    print("The rows with", keyword, "are")
    print(subset_with_keyword)

The rows with boto are
       download_count                             project
0          1346601849                               boto3
2           609967137                            botocore
11          454054215                         aiobotocore
554          16054951                         boto3-stubs
566          15361139                      botocore-stubs
...               ...                                 ...
14971           25243      botocore-a-la-carte-greengrass
14974           25236  botocore-a-la-carte-workspaces-web
14977           25219       types-aiobotocore-securityhub
14987           25185  types-aiobotocore-servicediscovery
14991           25182     botocore-a-la-carte-iotsitewise

[533 rows x 2 columns]
The rows with py are
       download_count                        project
12          453387776                python-dateutil
15          381826344                          numpy
16          360961023                         pyyaml
23          283246117                       pydantic
24          282364469                      pycparser
...               ...                            ...
14969           25252                pytest-splinter
14972           25242    tencentcloud-sdk-python-acp
14973           25241                        xrpl-py
14984           25203               opening-hours-py
14994           25174  tencentcloud-sdk-python-hasim

[3275 rows x 2 columns]
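
If you only need to know which rows match any of the keywords, the loop can be replaced by a single pattern. The sketch below uses a small made-up DataFrame in place of the real spreadsheet, and it assumes you want matching to ignore capitalisation.

```python
import re

import pandas as pd

# Small stand-in for the "project" column used above (made-up rows)
df = pd.DataFrame({"project": ["boto3", "numpy", "botocore", "requests", "PyYAML"]})

keywords = ["boto", "py"]

# Escape each keyword so characters like "." or "+" are matched literally,
# then join with "|" (regex "or")
pattern = "|".join(re.escape(keyword) for keyword in keywords)

# case=False ignores capitalisation; na=False treats missing values as "no match"
matches = df["project"].str.contains(pattern, case=False, na=False)
matched_projects = df.loc[matches, "project"].tolist()
print(matched_projects)
```

The trade-off is that you no longer see which keyword matched each row. Keep the loop if you need that.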

Answers

Answer 1
Answer 2
…

Question 3: Is it possible to use Python to scrape Facebook data, particularly posts and their interactions in public Facebook groups? | Juddi

Add more details here, then press CTRL + ENTER when you’re done

# Put any code you'd like to run here!


# Warning: scraping Facebook with requests could get you blocked or banned
import requests

r = requests.get("https://www.facebook.com/groups/553828734993634/")

data = r.content

# Maybe try Selenium (it has a Python package, selenium)

# Talk to Sam Hames
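
Before scraping any site, one way to reduce the risk of being blocked is to check what its robots.txt file allows. The sketch below uses the standard library's urllib.robotparser. The robots.txt content and the example.com URLs are made up so that the example runs offline; they are not Facebook's actual rules.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, written inline so the example runs offline
sample_robots_txt = """\
User-agent: *
Disallow: /groups/
Allow: /
"""

parser = RobotFileParser()
parser.parse(sample_robots_txt.splitlines())

# can_fetch(useragent, url) reports whether the rules permit fetching the URL
group_allowed = parser.can_fetch("*", "https://example.com/groups/123/")
about_allowed = parser.can_fetch("*", "https://example.com/about")

print("May I fetch the group page?", group_allowed)
print("May I fetch the about page?", about_allowed)
```

For a real site you would call parser.set_url(...) with the address of its robots.txt and then parser.read() instead of parse(). A robots.txt check does not replace reading the site's terms of service.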

Answers

Answer 1
Answer 2
…

Question 5: Question | Name

Add more details here, then press CTRL + ENTER when you’re done

# Put any code you'd like to run here!

Answers

Answer 1
Answer 2
…