San José State University

College of Professional and Global Education · School of Information

Seminar in Information Science - Problem Solving with Data, Part Two
INFO 287

  • Spring 2023
  • Section 15
  • 2 Unit(s)
  • 03/13/2023 to 05/15/2023
  • Modified 05/22/2023

Canvas Information: Courses will be available March 13th, 6 am PT.

You will be enrolled in the Canvas site automatically.

Contact Information

Dr. Souvick Ghosh
Email
Office Location: Virtual/Online
Office Hours: Virtually (by appointment) via telephone or online

Course Description and Requisites

This course offers an advanced understanding of data science through the application of machine and deep learning techniques to real-life data problems. The course covers the development of tool-based and coding-based solutions to analyze and visualize data. It will provide advanced knowledge in multiple machine learning tools and platforms, data processing techniques, and evaluation.

Requisites

INFO 200. Students are expected to know HTML/CSS, either as taught in INFO 240 or through work experience.

Classroom Protocols

Expectations

Students are expected to participate fully in all class activities and discussions, to keep an open mind, and to debate in a mature and respectful manner. Use of derogatory, condescending, or offensive language, including profanity, is prohibited. Disagreement is healthy and perfectly acceptable. Expressing disagreement should always include an explanation of your reasoning and, whenever possible, evidence to support your position. In accordance with San José State University's policies, the Student Code of Conduct, and applicable state and federal laws, discrimination based on gender, gender identity, gender expression, race, nationality, ethnicity, religion, sexual orientation, or disability is prohibited in any form.

Program Information

Course Workload

Success in this course is based on the expectation that students will spend, for each unit of credit, a minimum of forty-five hours over the length of the course (normally 3 hours per unit per week with 1 of the hours used for lecture) for instruction or preparation/studying or course related activities including but not limited to internships, labs, clinical practica. Other course structures will have equivalent workload expectations as described in the syllabus.

Instructional time may include but is not limited to:
Working on posted modules or lessons prepared by the instructor; discussion forum interactions with the instructor and/or other students; making presentations and getting feedback from the instructor; attending office hours or other synchronous sessions with the instructor.

Student time outside of class:
In any seven-day period, a student is expected to be academically engaged through submitting an academic assignment; taking an exam or an interactive tutorial, or computer-assisted instruction; building websites, blogs, databases, social media presentations; attending a study group; contributing to an academic online discussion; writing papers; reading articles; conducting research; engaging in small group work.

Course Goals

Core Competencies (Program Learning Outcomes) Supported

INFO 287 supports the following core competencies:

  1. G Demonstrate understanding of basic principles and standards involved in organizing information such as classification and controlled vocabulary systems, cataloging systems, metadata schemas or other systems for making information accessible to a particular clientele.
  2. H Demonstrate proficiency in identifying, using, and evaluating current and emerging information and communication technologies.

Course Learning Outcomes (CLOs)

Upon successful completion of the course, students will be able to:

  1. Understand the theory behind various machine learning algorithms and their application to real-world data problems.
  2. Collect real-world data using APIs, or from databases, then clean and process the data to make it useful for analysis.
  3. Analyze and compare machine learning tools and algorithms and apply them suitably for solving problems.
  4. Build simple deep neural models for prediction tasks.
  5. Create visualizations to explore and analyze data.
  6. Understand the importance of transparency and fairness in machine learning.

Course Materials

Textbooks

Recommended Textbooks:

  • Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep learning. MIT Press. Available free online
  • Jamsa, K. (2020). Introduction to data mining and analytics. Jones & Bartlett Learning. Available through Amazon: 1284180905
  • Mohri, M., Rostamizadeh, A., & Talwalkar, A. (2018). Foundations of machine learning. MIT Press. Available free online
  • Severance, C. (2016). Python for everybody: Exploring data in Python 3. CreateSpace Independent Publishing Platform. Available through Amazon: 1530051126
  • Shah, C. (2020). A hands-on introduction to data science. Cambridge University Press. Available through Amazon: 1108472443
  • Shalev-Shwartz, S., & Ben-David, S. (2014). Understanding machine learning: From theory to algorithms. Cambridge University Press. Available free online
  • Zhang, A., Lipton, Z. C., Li, M., & Smola, A. J. (2020). Dive into deep learning. Unpublished draft. Available free online

Course Requirements and Assignments

Lectures, discussions, assignments, and rubrics will be posted to the Canvas course management system. Links to additional materials will be provided in Canvas as well.

Summary of assignments and points earned:

  • Blog Entries/Discussion Forums - 17 points [CLOs 1,2,3,6]
    • Introduction - 2 points
    • 3 blog entries, 5 points each = 15 points
  • Assignments - Best 4 out of 5 assignments, 12 points each = 48 points [CLOs 1,2,3,5]
  • Group Project - 35 points [CLOs 1,2,3,4,5]
    • Groups and Idea - 5 points
    • Presentation - 10 points
    • Peer Evaluation - 5 points
    • Report - 15 points

The total number of points for this class is 100.
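
The point breakdown above can be checked with a short Python sketch. The function name `course_total` and the sample scores are hypothetical, chosen only for illustration; the one rule taken from the syllabus is that the best 4 of 5 assignment scores count.

```python
def course_total(intro, blogs, assignments, project_parts):
    """Total course points, counting only the best 4 of 5 assignment scores."""
    best_four = sorted(assignments, reverse=True)[:4]
    return intro + sum(blogs) + sum(best_four) + sum(project_parts)


# Hypothetical student: full marks everywhere, but one assignment was skipped.
total = course_total(
    intro=2,                          # Introduction
    blogs=[5, 5, 5],                  # three blog entries
    assignments=[12, 12, 0, 12, 12],  # the lowest score is dropped
    project_parts=[5, 10, 5, 15],     # idea, presentation, peer evaluation, report
)
print(total)  # 100
```

Because only the best four assignment scores are counted, one skipped assignment does not lower the total in this example.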

Assignments Due

Students are expected to check the course site several times each week. Unless otherwise noted, each module begins on Wednesday and ends on Tuesday. Assignments will be due by midnight (Pacific Time) on the due date. Contact the instructor prior to the due date in the case of serious illness or emergency.

  1. Lesson Dates: Most Lesson and Worktime periods begin on a Monday. Lesson materials are posted at least a week in advance.
  2. Due-date times are 11:59 pm Pacific Time.
  3. Three Topical Discussions: Initial posts need to start early, and responses are also required. See details in the Discussion Instructions on the course site. NOTE: For weeks with required discussion board postings, students should provide their initial post by Saturday at midnight (Pacific Time), to leave ample time for follow-up discussion. Please participate actively in the required discussions.

Late Policy

  • Late assignments will not be accepted after 5 days past the due date. Late assignments submitted after the assignment deadline will receive a 10% point reduction for each day up to 5 days based on the total point value of the assignment. For example, a 25-point assignment would have a daily 2.5-point reduction; a 15-point assignment would have a daily 1.5-point reduction; a 5-point assignment would have a daily 0.5-point reduction. No points will be awarded after 5 days late.
  • Discussion board postings will not be accepted for credit after the week's discussion has ended.
  • All course materials must be completed by the last day of the class.
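
The late-penalty arithmetic can be sketched in Python as follows. This is a minimal illustration, not an official calculator: the function name `late_deduction` is hypothetical, and counting lateness in whole days is an assumption, since the policy does not say how partial days are treated.

```python
PENALTY_PERCENT_PER_DAY = 10  # 10% of the assignment's total point value
MAX_LATE_DAYS = 5             # nothing is accepted after this many days


def late_deduction(total_points, days_late):
    """Points deducted for a submission that is `days_late` whole days late.

    Assumption: lateness is counted in whole days.
    """
    if days_late < 0:
        raise ValueError("days_late cannot be negative")
    if days_late > MAX_LATE_DAYS:
        return float(total_points)  # no points are awarded after 5 days
    return total_points * days_late * PENALTY_PERCENT_PER_DAY / 100


# Reproduce the daily reductions quoted in the policy.
for points in (25, 15, 5):
    print(points, late_deduction(points, 1))
```

Under this reading, a 12-point assignment submitted 3 days late would lose 3.6 points before grading, and one submitted 6 days late would earn nothing.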

Grading Information

The standard SJSU School of Information Grading Scale is utilized for all iSchool courses:

97 to 100: A
94 to 96: A minus
91 to 93: B plus
88 to 90: B
85 to 87: B minus
82 to 84: C plus
79 to 81: C
76 to 78: C minus
73 to 75: D plus
70 to 72: D
67 to 69: D minus
Below 67: F

In order to provide consistent guidelines for assessment for graduate level work in the School, these terms are applied to letter grades:

  • C represents Adequate work; a grade of "C" counts for credit for the course;
  • B represents Good work; a grade of "B" clearly meets the standards for graduate-level work (or undergraduate-level work for BS-ISDA);
    For core courses in the MLIS program (not MARA, Informatics, BS-ISDA) — INFO 200, INFO 202, INFO 204 — the iSchool requires that students earn a B in the course. If the grade is less than B (B- or lower) after the first attempt you will be placed on administrative probation. You must repeat the class if you wish to stay in the program. If - on the second attempt - you do not pass the class with a grade of B or better (not B- but B) you will be disqualified.
  • A represents Exceptional work; a grade of "A" will be assigned for outstanding work only.

Graduate Students are advised that it is their responsibility to maintain a 3.0 Grade Point Average (GPA). Undergraduates must maintain a 2.0 Grade Point Average (GPA).

University Policies

Per University Policy S16-9 (PDF), relevant university policy concerning all courses, such as student responsibilities, academic integrity, accommodations, dropping and adding, consent for recording of class, etc. and available student services (e.g. learning assistance, counseling, and other resources) are listed on the Syllabus Information web page. Make sure to visit this page to review and be aware of these university policies and resources.

Course Schedule

A detailed Course Calendar will be available in Canvas on the first day of the semester. The table below provides a summary of course topics and assignment due dates. It is subject to minor changes that will be announced with fair notice.

Course Overview

| Week | Dates(1) | Lesson topic & readings(3) | Assignments / Activities(4) | Due-Date(2) | Points |
|---|---|---|---|---|---|
| 1 | Mar 13-19 | 1: Introduction to INFO 287-15. Lecture with recordings + selected readings. Topics: Data Mining; Data Science; Machine Learning | Introductions | Mar 17 & 19 | 1 & 1 |
| | | | Blog 1 Posts: Why should we study machine learning? Reflect on how it will help you in your workplace/profession. First post | Mar 17 | 3 |
| | | | Blog 1 Posts: Two responses | Mar 29 | 2 |
| 2 | Mar 20-26 | 2: Ethics of Machine Learning. Lecture with recordings + selected readings. Topics: Fairness, Accountability, Transparency and Explainability in ML; Steps to build fair AI; How to deal with unbalanced datasets | Group Project: Groups and Idea (identify an interesting topic; develop research questions on the topic) | Mar 26 | 5 |
| | | | Blog 2 Posts: Why are transparency and fairness important in machine learning? Reflect on application scenarios. First post | Mar 24 | 3 |
| | | | Blog 2 Posts: Two responses | Mar 26 | 2 |
| | Mar 27-31 | Spring Break & Cesar Chavez Day | | | |
| 3 | Mar 27-Apr 9 | 3: Clustering. Lecture with recordings + selected readings. | Assignment 1 | Apr 9 | 12 |
| 4 | Apr 10-16 | 4: Classification. Lecture with recordings + selected readings. | Assignment 2 | Apr 16 | 12 |
| | | | Blog 3 Posts: Real-world situations that differentiate between supervised and unsupervised learning scenarios. First post | Apr 14 | 3 |
| | | | Blog 3 Posts: Two responses | Apr 16 | 2 |
| 5 | Apr 17-23 | 5: Predictive Analysis. Lecture with recordings + selected readings. | Assignment 3 | Apr 23 | 12 |
| 6 | Apr 24-30 | 6: Working with Text. Lecture with recordings + selected readings. 9.1: Introduction to Deep Learning | Assignment 4 | Apr 30 | 12 |
| 7 | May 1-7 | 7: Analyzing YouTube Data. Lecture with recordings + selected readings. Topics: Using Google API to collect data; Using ML to analyze data. 9.2: Application of Deep Learning 1 | Assignment 5 | May 1 | 12 |
| | | | Group Project: Presentation | May 7 | 10 |
| 8 | May 8-15 | 8: Analyzing Yelp Data. Lecture with recordings + selected readings. Topics: Using Google API to collect data; Using ML to analyze data. 9.3: Application of Deep Learning 2. Course Wrap Up | Group Project: Peer Evaluation | May 9 | 5 |
| | | | Group Project: Report | May 15 | 15 |

  1. Lesson Dates: Most Lesson and Worktime periods begin on a Monday. Lesson materials are posted at least a week in advance.
  2. Due-date Times are 11:59 pm Pacific time zone.
  3. Three Topical Discussions: Initial posts need to start early, and responses are also required. See details in the Discussion Instructions on the course site.
