Concurrency in Go with sync.WaitGroup

Concurrency is one of Go's core features, allowing programs to handle multiple tasks at once. Two essential tools in Go's concurrency model are goroutines and the sync.WaitGroup type. This post explores sync.WaitGroup, explains what problem it solves, and walks through a real-world example with code.
Before diving in, make sure you have a working Go development environment.

What Are Goroutines?

Goroutines are lightweight threads managed by the Go runtime. To run a function concurrently, add the go keyword before the function call.

To understand how goroutines work, let’s start with a basic Go application:

package main

import (
	"fmt"
)

func sayHello() {
	fmt.Println("Hello, World!")
}

func main() {
	sayHello()
}

Here, main() calls sayHello(), which prints "Hello, World!" and returns. Then main() returns and the program exits. Each line executes sequentially, one after the other.

Now, let’s modify this program by adding the go keyword before the sayHello() call:

package main

import (
	"fmt"
)

func sayHello() {
	fmt.Println("Hello, World!")
}

func main() {
	go sayHello() // Running `sayHello` as a goroutine
}

In this version, sayHello() runs as a goroutine because we added go before the call. The main function continues without waiting for sayHello to finish, and a Go program exits as soon as main returns, whether or not other goroutines are still running. As a result, the program typically prints nothing: it exits before the goroutine gets a chance to run.

This brings us to the need for sync.WaitGroup to ensure synchronization.

What Problem Does sync.WaitGroup Solve?

When working with goroutines, it’s crucial to know when all goroutines have completed their work, especially if the main function or other parts of the program depend on their results. sync.WaitGroup helps solve this problem by:

  1. Tracking the number of goroutines that are running.
  2. Waiting for all goroutines to complete before proceeding.

Without WaitGroup, the main program may finish and exit before all goroutines complete their tasks, leading to incomplete operations or data inconsistencies.

How to Use sync.WaitGroup in Go

To use sync.WaitGroup, follow these steps:

  1. Declare a WaitGroup variable.
  2. Call Add(1) to increment the WaitGroup counter before starting each goroutine.
  3. Call Done() from each goroutine when it completes, typically with defer.
  4. Wait for all goroutines to finish by calling Wait() on the WaitGroup in the main function or wherever synchronization is needed.
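
The four steps above might map onto code roughly as follows. This is a minimal sketch, not a canonical pattern: squareAll is a hypothetical helper invented for illustration, and each goroutine writes only to its own slice element so that no extra locking is needed.

```go
package main

import (
	"fmt"
	"sync"
)

// squareAll is a hypothetical helper: it squares each input in its own
// goroutine and returns the results in input order.
func squareAll(nums []int) []int {
	results := make([]int, len(nums))

	var wg sync.WaitGroup // Step 1: declare the WaitGroup
	for i, n := range nums {
		wg.Add(1) // Step 2: count the goroutine before starting it
		go func(i, n int) {
			defer wg.Done()    // Step 3: signal completion on return
			results[i] = n * n // each goroutine writes only its own slot
		}(i, n)
	}
	wg.Wait() // Step 4: block until the counter is back to zero

	return results
}

func main() {
	fmt.Println(squareAll([]int{1, 2, 3, 4})) // prints [1 4 9 16]
}
```

Calling Add before the go statement, rather than inside the goroutine, avoids a situation where Wait runs before the counter has been incremented.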

Imagine we have a web scraping application that fetches data from multiple URLs concurrently. We want all URLs to be fetched before processing the data. Here’s how sync.WaitGroup can help manage this scenario:

package main

import (
	"fmt"
	"net/http"
	"sync"
)

// fetchData fetches data from a URL and prints the status.
func fetchData(url string, wg *sync.WaitGroup) {
	defer wg.Done() // Decrement the WaitGroup counter when done

	resp, err := http.Get(url)
	if err != nil {
		fmt.Printf("Failed to fetch data from %s: %v\n", url, err)
		return
	}
	defer resp.Body.Close()

	fmt.Printf("Fetched data from %s with status code %d\n", url, resp.StatusCode)
}

func main() {
	// URLs to be fetched
	urls := []string{
		"https://example.com",
		"https://golang.org",
		"https://openai.com",
	}

	// Initialize the WaitGroup
	var wg sync.WaitGroup

	// Start a goroutine for each URL
	for _, url := range urls {
		wg.Add(1) // Increment the WaitGroup counter
		go fetchData(url, &wg)
	}

	// Wait for all goroutines to complete
	wg.Wait()
	fmt.Println("All URLs have been fetched.")
}

Here is what the code does, step by step:

  1. Initialize WaitGroup: We declare a sync.WaitGroup named wg in main(). Its zero value is ready to use.
  2. Add to the WaitGroup Counter: For each URL, we increment the WaitGroup counter by calling wg.Add(1) before starting the goroutine.
  3. Run Goroutines: We run fetchData as a goroutine for each URL, passing a pointer to wg. The pointer matters because a WaitGroup must not be copied after first use; a copy would have its own counter.
  4. Signal Completion with Done(): Inside fetchData, wg.Done() is called with defer to ensure it runs when the function returns, even if an error occurs.
  5. Wait for Completion: Back in main(), we call wg.Wait() to block further execution until every goroutine has called Done().

Output

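The output will look something like this. Because the goroutines run concurrently, the order of the first three lines depends on network speed and response times and can change from run to run: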

Fetched data from https://example.com with status code 200
Fetched data from https://golang.org with status code 200
Fetched data from https://openai.com with status code 200
All URLs have been fetched.

Why sync.WaitGroup is Essential Here

Without WaitGroup, the program might exit immediately after starting the goroutines, leading to incomplete fetch operations. By using sync.WaitGroup, we ensure all URLs are processed before moving forward.

When to Use sync.WaitGroup

sync.WaitGroup is useful when:

  • Coordinating goroutines: You have multiple goroutines and need to ensure they all finish before continuing.
  • Waiting on multiple tasks: You need to start several concurrent tasks and wait until they have all completed.

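WaitGroup only counts completions; it does not carry results or protect shared data. When tasks depend on each other in more complex ways, consider other synchronization tools: channels for passing values and signals between goroutines, or mutexes for guarding shared state.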
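
As one illustration of combining these tools, the sketch below has workers send their results over a channel while a WaitGroup determines when the channel can be closed. The sumSquares helper is hypothetical and exists only for this example.

```go
package main

import (
	"fmt"
	"sync"
)

// sumSquares is a hypothetical helper: each worker sends one result over a
// channel, and a WaitGroup tells us when it is safe to close that channel.
func sumSquares(nums []int) int {
	results := make(chan int)

	var wg sync.WaitGroup
	for _, n := range nums {
		wg.Add(1)
		go func(n int) {
			defer wg.Done()
			results <- n * n
		}(n)
	}

	// Close the channel once every worker has sent its result,
	// so the range loop below terminates.
	go func() {
		wg.Wait()
		close(results)
	}()

	total := 0
	for r := range results {
		total += r
	}
	return total
}

func main() {
	fmt.Println(sumSquares([]int{1, 2, 3})) // prints 14
}
```

Running wg.Wait() in its own goroutine lets the caller keep receiving from the channel while the workers are still sending, which avoids a deadlock on the unbuffered channel.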

Conclusion

Go’s sync.WaitGroup is an invaluable tool for managing concurrency, especially when working with multiple goroutines. By counting outstanding goroutines and blocking until that count reaches zero, WaitGroup ensures that all tasks finish before the program proceeds. Without it, main might return while goroutines are still running, cutting their work short. Using WaitGroup in applications that handle concurrent tasks brings more control and predictability to your code.

© 2024 Solution Toolkit. All rights reserved.