September 2023 Update: Get Out the Pumpkin

Community Tech Alliance
Oct 2, 2023


Believe it or not, we’re just about one year out from GOTV season. Yes, we said it — sorry.

We’re not saying it to scare you (or ourselves!) but because the planning for a successful GOTV starts now. We’ve got runway — let’s make the most of it.

What can CTA build for you now to avoid scrambling in the future? Let us know in this short survey — we’re eager to help!

The 5 Common Data Problems

For years, we’ve heard the same challenge over and over again from partners: we’ve got a wealth of data but it’s hard to use it effectively.

As we’ve gotten more and more sophisticated about ways to capture data across the movement, it’s been hard to keep up with ways to handle that data — from storage and security, to integration and making it actionable.

Sound familiar? You’re certainly not alone. Here are some of the most common challenges we see — and how we can help solve them.

1. Data Data Everywhere: Orgs are collecting more and more data, and that’s great. But too often, those data sources aren’t talking to each other, which significantly decreases the value of the data. And getting data sources to sync and talk to one another can be labor-intensive and expensive.

Our solution: PAD, the Progressive Action Database, a data tool that makes it easy to integrate data from disparate sources. Whether you’re bringing together ad data from different platforms, voter file data, volunteer data, financial data, or more, PAD’s infrastructure can bring these sources together.

2. Wasted Time + Energy on Integration: Trying to solve the silo problem, data teams (and you might be one of them!) are spending hours downloading and uploading data files or engineering one-off solutions, eating up huge amounts of time that could be spent on valuable analysis.

Our solution: A baseline infrastructure (PAD) where you can sync and transform data. PAD can be customized to your organization’s needs in order to easily store, move, analyze, and visualize your data. PAD also syncs with over 50 tools (and counting). That’s 50+ integrations you don’t need to custom build.

3. Inactionable Data: With data siloed and hours sunk into building syncs, there isn’t much time left for gleaning insights and making programmatic improvements. Nor do most organizations have a good way to visualize data that’s accessible to folks outside of the data team.

Our solution: PAD’s integration with Looker Studio means orgs can better access, work with, and understand their data. (PAD also integrates with Tableau, Periscope, and other reporting tools.)

4. Data Security, Storage, and Cost: The cost and sprawl of managing all this disparate data is another headache and expense, and one that requires constant vigilance on security. That gets expensive quickly, and staying on top of the latest security updates is time-intensive.

Our solution: By treating our data infrastructure like a public utility for our partners, we’ve been able to reduce the cost of data housing, syncs, and transformations by creating a singular, robust infrastructure. PAD also ensures uptime & reduces threat surface area.

5. Tool Sprawl + Degradation: With so many tools evolving (or disappearing) and new tools being introduced, it can feel impossible to stay up to date on what’s happening, meaning it’s hard to keep data current and safe.

Our solution: PAD is a shortcut because you’re building on top of an existing, solid, and tested data infrastructure that gets customized for your organization. No more worrying about the latest updates or tools — we take care of that for you.

If any of these sound like challenges your organization is tackling, you’re certainly not alone, and we’re here to help. Send us a note at info@techallies.org and let’s get you some solutions!

Takeaways from the Google Cloud Next Conference

Last month, Huy, Emily, and Kelsey from our engineering team attended the Google Cloud Next conference in San Francisco. Their themes, takeaways, and tips are below (as well as pictures of Prog, obviously).

Q: What is something unexpected you learned at Google Cloud Next?
Huy: (Not so) fun fact: 95% of principals use less than 3% of their permissions! While it might be tempting to pile on permissions when you’re not sure what you need, keep in mind that users and service accounts are highly susceptible points of attack. Ensuring that your principals are endowed with only restricted access rights is pivotal to minimizing security breaches.

Emily: I was surprised by just how heavily Google emphasized Generative AI as a feature they are rolling out across a number of products. They really drove home the point that “GenAI is here, and it’s going to change everything, and you should use it.” We are still figuring out what role GenAI will play for us at CTA as we take the security of our data very seriously. However, it is clear it will be used widely in both the private and public sectors, and that’s a shift we will need to stay informed about. And some GenAI tools, like natural-language-to-SQL, could provide real value to our users, and that’s exciting!

Q: What is coming on the horizon that you were excited to share with your team?
Huy: Surprisingly, Google Meet with Duet AI! I’m a big multi-tasker for better or worse, so during meetings, I have a hard time not looking away for a minute to work on a small task or write a bit of code. Sometimes this can cause me to miss a key point/action item. Google is rolling out new features for Meet with Duet AI, including generating action items, mid-meeting summaries, and a chatbot that can talk about details discussed during an ongoing call.

Kelsey: I’m excited to see how tools utilizing natural language processing, like Duet AI and BigQuery Studio, may be able to fill any technical gaps for folks that we partner with so that they can get the most out of PAD and reduce the need for technical expertise in order to be productive. The idea of being able to use natural language directly within BQ to analyze data is exciting, and I can see that being a major time saver — even for folks who use SQL regularly but may not know the specific syntax they need to accomplish a task.

Q: What are some tips that you learned at the conference that people can use right now?
Huy: You can use the Security Command Center in Google Cloud to scan for vulnerabilities in your organization and detect misconfigurations that may be weakening your security! There’s a free tier to try, while the premium tier offers enhanced scanning, threat detection, and more.

Kelsey: Utilize labels in ETL/ELT jobs to organize resources and track costs throughout your project. Information about labels is forwarded to the billing system, so you can break down billed charges by label. This way resource usage can be easily attributed to specific teams, projects, partners, etc.
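
As a rough sketch of Kelsey’s tip: the google-cloud-bigquery Python client accepts labels on a query job through QueryJobConfig. The helper names (clean_label, build_labels, run_labeled_query) and the label keys (team, partner) are hypothetical, and the normalization reflects our reading of Google Cloud’s label rules (lowercase letters, digits, underscores, and dashes, up to 63 characters), so check the current documentation before relying on it.

```python
import re


def clean_label(value: str) -> str:
    """Normalize a string into a label-safe form.

    Assumed rules: lowercase letters, digits, underscores, and dashes,
    with a maximum length of 63 characters.
    """
    lowered = value.strip().lower()
    # Replace anything outside the assumed allowed set with a dash.
    return re.sub(r"[^a-z0-9_-]", "-", lowered)[:63]


def build_labels(team: str, partner: str) -> dict:
    """Build the label dict attached to each job (keys are hypothetical)."""
    return {"team": clean_label(team), "partner": clean_label(partner)}


def run_labeled_query(sql: str, labels: dict):
    """Run a query with labels so its cost can be broken down in billing."""
    # Imported here so the helpers above work without the client installed.
    from google.cloud import bigquery

    client = bigquery.Client()
    job_config = bigquery.QueryJobConfig(labels=labels)
    return client.query(sql, job_config=job_config).result()
```

For example, run_labeled_query("SELECT 1", build_labels("Data Team", "Partner A")) would tag that job with team=data-team and partner=partner-a.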

Emily: As Kelsey mentioned, plenty of tools are already available for tracking costs, both for analysis (running queries) and for storage (maintaining data in BigQuery). The INFORMATION_SCHEMA views in BigQuery make it easy to estimate storage costs by looking at the size of tables. You can also configure GCP to create a BigQuery table with logs of the queries users run, which makes it easy to track how much each user or project is spending on queries and how much each individual query costs. How neat is that!
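
Here is a minimal sketch of the storage half of Emily’s tip, using the INFORMATION_SCHEMA.TABLE_STORAGE view. The region qualifier (region-us), the per-GiB prices, and the function names are assumptions for illustration; the estimate also ignores any free tier and assumes logical (not physical) storage billing, so treat the numbers as a rough guide only.

```python
GIB = 1024 ** 3

# Placeholder prices in USD per GiB per month. These are assumptions:
# substitute the current rates for your region and billing model.
ACTIVE_PRICE_PER_GIB = 0.02
LONG_TERM_PRICE_PER_GIB = 0.01

# The `region-us` qualifier is an assumption; use your dataset's region.
STORAGE_SQL = """
SELECT table_schema, table_name, total_logical_bytes,
       active_logical_bytes, long_term_logical_bytes
FROM `region-us`.INFORMATION_SCHEMA.TABLE_STORAGE
ORDER BY total_logical_bytes DESC
"""


def estimate_monthly_cost(active_bytes: int, long_term_bytes: int) -> float:
    """Estimate monthly storage cost in USD from logical byte counts."""
    active = active_bytes / GIB * ACTIVE_PRICE_PER_GIB
    long_term = long_term_bytes / GIB * LONG_TERM_PRICE_PER_GIB
    return round(active + long_term, 4)


def storage_report() -> None:
    """Print a per-table storage cost estimate, largest tables first."""
    # Imported here so the estimator above works without the client installed.
    from google.cloud import bigquery

    client = bigquery.Client()
    for row in client.query(STORAGE_SQL).result():
        cost = estimate_monthly_cost(
            row.active_logical_bytes or 0, row.long_term_logical_bytes or 0
        )
        print(f"{row.table_schema}.{row.table_name}: ~${cost}/month")
```

Calling storage_report() from an authenticated environment would list each table with its estimated monthly cost under the placeholder prices.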

The team with Prog at the conference, catching some downtime and participating in trainings.

Notes from BenDesk*

*Ben is our resident ZenDesk captain and manager of all help@ inquiries. We’re bringing you interesting inquiries from his inbox each month to help share learnings across our community.

Question of the Month: Is it possible to connect PAD to Python? If so, how do I set this up and authenticate access?

BenDesk Answer: Yes, it is! A lot of people prefer using Python for data manipulation and interaction. BigQuery provides a variety of ways to effortlessly connect to your data using Python, including Google Colab (similar to Jupyter Notebook), Cloud Shell, and Google Cloud CLI. You can also use the Python client library and the Google Cloud CLI to connect to PAD from your local environment.

To get started, first install the Google Cloud CLI. Next, add gcloud CLI to your PATH by running the install.sh script. Once installed, follow the instructions on the page to initialize gcloud CLI and grant your computer access to your PAD project. It’s important to remember to log in and grant access using your CTA Google account instead of any personal or work emails.

Afterward, you can install the Python client library named “google-cloud-bigquery,” which you can use in your Python scripts. The library will automatically use your default gcloud configuration set up using the gcloud CLI. Once it’s successfully installed and authenticated, you can test if it works correctly by running the sample code below:

from google.cloud import bigquery

# Uses the default credentials configured through the gcloud CLI
client = bigquery.Client()

QUERY = (
    'SELECT name FROM `bigquery-public-data.usa_names.usa_1910_2013` '
    'WHERE state = "TX" '
    'LIMIT 100')
query_job = client.query(QUERY)  # API request
rows = query_job.result()  # Waits for query to finish

# Print each name returned by the query
for row in rows:
    print(row.name)
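
Once the sample query above works, a possible next step is to pass values as query parameters instead of writing them into the SQL string, and to use a dry run to see how much data a query would scan before paying for it. This is only a sketch: the function names estimate_query_size and human_size are hypothetical, and it reuses the public dataset from the sample rather than any real PAD table.

```python
SQL = (
    "SELECT name FROM `bigquery-public-data.usa_names.usa_1910_2013` "
    "WHERE state = @state "
    "LIMIT 100"
)


def human_size(num_bytes: int) -> str:
    """Format a byte count using binary units for easier reading."""
    size = float(num_bytes)
    for unit in ("B", "KiB", "MiB", "GiB"):
        if size < 1024:
            return f"{size:.1f} {unit}"
        size /= 1024
    return f"{size:.1f} TiB"


def estimate_query_size(state: str) -> int:
    """Dry-run the parameterized query and return the bytes it would scan."""
    # Imported here so human_size works without the client installed.
    from google.cloud import bigquery

    client = bigquery.Client()
    job_config = bigquery.QueryJobConfig(
        query_parameters=[
            bigquery.ScalarQueryParameter("state", "STRING", state)
        ],
        dry_run=True,  # Validate and estimate only; no query is run
        use_query_cache=False,
    )
    job = client.query(SQL, job_config=job_config)
    return job.total_bytes_processed
```

In an authenticated environment, print(human_size(estimate_query_size("TX"))) would report the estimated scan size without running the query.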

For more information on connecting PAD to Python, look at our help article here.

What We’re Reading

  • The Great Alone by Kristin Hannah: A classic coming-of-age story in almost all ways except that it takes place in the Alaskan wilderness.
  • Bird by Bird by Anne Lamott: If you’ve ever wanted to write a novel or short story, Anne Lamott’s book is a must-read. It is instructive while still being completely charming, funny, warm, and relatable. Honestly, it’s great even if you’re not planning on doing any writing.
  • Yellowface by R. F. Kuang: A page-turner about, of all things, the publishing industry. This book explores the ideas of diversity and cultural appropriation and the cut-throat publishing world. Definitely one that you’ll want to discuss with friends.

