Neighborhood Analysis

Building a Data Pipeline

Session Description

In this session, we’ll explore the basic workflow that we’ll use throughout the semester to package and share analysis. We’ll build familiarity with Quarto and with basic GitHub operations so that you can share your code and analysis.

Lab 1 Link

Before Class

Ensure that your computer has the latest stable versions of R and RStudio installed.

Accept the GitHub invitation to our Lab 1 repository and download the repository to your local computer (we will set up more advanced tools for interacting with GitHub in our next lab session).

D’Ignazio, Catherine, and Lauren F. Klein. (2020). Data Feminism. MIT Press. Chapters 1 and 2.

Reflect

Workflows

  • What are the types of common tasks in your workflows that you think would benefit from a data pipeline?

  • How do we hold ourselves accountable for our analysis?

Readings

  • Whose interests and goals do you seek to represent through your work?

  • How does Collins’ matrix of domination (structural, disciplinary, hegemonic, interpersonal) interact with acts of data-driven storytelling?

  • What missing datasets (akin to the Library of Missing Datasets) have you observed? [1]

  • What’s an analysis that you’d like to reconstruct in ways that challenge the power it manifests?

Slides

Resources for Further Exploration

Footnotes

  1. At the beginning of our session, we’ll catalog some of these datasets, so it may help to write down some of your thoughts to share.

Content Andrew J. Greenlee