We are used to reading about big things – high-level architectural solutions, design patterns in action, or big data processing. However, most of our problems are small and live at a low level of our code. One of them is: how do we synchronize a collection of entities with another collection that is its simplified representation?

Imagine you have a collection of entities. You receive an updated version of that collection, e.g., from an API endpoint. But what if the data are transferred using DTOs? We should add new entities based on the data in the DTOs, update existing entities, and remove elements that don’t appear in the edited collection.

In this article, I’ll show you a generalized solution to this particular problem.

Context

Before we start working on the actual solution, let’s look at the problem itself. Imagine an array containing structures that represent the products in a shopping cart.

$cart = [
    ["id" => 1, "name" => "Apple iPhone 6", "quantity" => 1],
    ["id" => 2, "name" => "Samsung Galaxy S6", "quantity" => 1],
    ["id" => 3, "name" => "USB C-HDMI Adapter", "quantity" => 1],
];

Because many things may change simultaneously, we don’t want to process each action separately. Just imagine a frontend app where you can modify the basket purely on the frontend side and send the updated version to the API[1].

So, the user sends an updated version of the cart:

$cart = [
    ["id" => 1, "name" => "Apple iPhone 6", "quantity" => 2],
    ["id" => 3, "name" => "USB C-HDMI Adapter", "quantity" => 1],
    ["id" => 4, "name" => "HDD 1TB", "quantity" => 1],
];

What happened here?

  • The user added a new product, HDD 1TB, to the cart,
  • The user removed the product Samsung Galaxy S6 from the cart,
  • The user changed the quantity of the product Apple iPhone 6 from 1 to 2.
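
These three changes can be derived mechanically by comparing both arrays by id. Here is a minimal sketch, assuming plain PHP arrays shaped like the ones above. The names $oldCart, $newCart, $added, $removed, and $changed are only illustrative and are not part of the solution described later.

```php
<?php
$oldCart = [
    ["id" => 1, "name" => "Apple iPhone 6", "quantity" => 1],
    ["id" => 2, "name" => "Samsung Galaxy S6", "quantity" => 1],
    ["id" => 3, "name" => "USB C-HDMI Adapter", "quantity" => 1],
];
$newCart = [
    ["id" => 1, "name" => "Apple iPhone 6", "quantity" => 2],
    ["id" => 3, "name" => "USB C-HDMI Adapter", "quantity" => 1],
    ["id" => 4, "name" => "HDD 1TB", "quantity" => 1],
];

// Re-index both arrays by product id so they can be compared by key.
$oldById = array_column($oldCart, null, "id");
$newById = array_column($newCart, null, "id");

$added   = array_keys(array_diff_key($newById, $oldById)); // ids only in the new cart
$removed = array_keys(array_diff_key($oldById, $newById)); // ids only in the old cart

// Ids present in both carts whose quantity differs.
$changed = [];
foreach (array_intersect_key($newById, $oldById) as $id => $item) {
    if ($item["quantity"] !== $oldById[$id]["quantity"]) {
        $changed[] = $id;
    }
}
```

This only tells us what changed. Applying those changes to the real data model is the actual problem this article deals with.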

Now, you may ask why we would even consider this a problem. After all, we have the updated version of the array right away. Well, yes and no. Yes – because it’s an updated version of the previous collection. No – because it’s only a simplified version of the data model that lives on the API side.

The API may already have an entity called Cart with assigned Products and some additional data, like the quantity in this case. That’s why, based on the updated $cart array, we need to update our underlying data model.

Basic solution

Forget about the generalization, and let’s develop a solution for this specific case. It may look as follows.

$cart = $this->getCart();
$newCart = $request->request->get("cart");

foreach ($cart->getProducts() as $product) {
    $updatedVersion = array_filter(
        $newCart,
        fn($newProduct) => $newProduct["id"] === $product->getId()
    );

    // remove Product from Cart model
    if (count($updatedVersion) === 0) {
        $cart->removeProduct($product);
        continue;
    }

    // update existing Product
    if (count($updatedVersion) === 1) {
        $index = array_key_first($updatedVersion);
        // array_filter preserves keys, so read the matched element by its index
        $product->setQuantity($updatedVersion[$index]["quantity"]);

        unset($newCart[$index]); // remove processed product
        continue;
    }
}

// add new elements to Cart model
foreach ($newCart as $newProduct) {
    $cart->addProduct(new ProductRef($newProduct["id"], $newProduct["quantity"]));
}

// flush changes into the database

We handle each case (add, update, remove), so Cart works as expected. Is there anything we can do better? Indeed, the implementation is a moot point, but we want to prepare a tool for handling similar cases without worrying about these lookups and loops.

Abstraction above synchronization of collections

The whole mechanism looks good, but to make it more generic, we need to replace the specific operations with abstractions:

  • How can we identify an updated version of the currently existing element?
  • How can we add a new element into the collection (including data model)?
  • How can we remove an element from the collection?
  • How can we update an element in the collection?

If we extract the behaviors mentioned above outside the mechanism, we should end up with a generic solution to synchronize collections, no matter what they contain.

Handling adding, updating, and removing sounds like an excellent candidate for the Policy pattern, also known as Strategy. The way to identify the corresponding element can be provided as an argument during synchronization.

That’s an idea.

Synchronized collection

The recipe is simple: wrap your collection in a special object called SynchronizedCollection and implement SynchronizationPolicy. You can then call the sync method, passing the array that the collection has to synchronize with and a matcher that tells it how to find the corresponding element.

In the snippets below, I omitted some parts of the code for readability. If you are looking for the whole code, I published it as a GitHub Gist.

SynchronizationPolicy.php:

interface SynchronizationPolicy
{
    /**
     * @return mixed added element
     */
    public function handleAdd($data);

    /**
     * @return mixed updated element
     */
    public function handleUpdate($origin, $data);
 
    public function handleRemove($origin);
}

SynchronizedCollection.php:

class SynchronizedCollection
{
    private array $collection;
    private SynchronizationPolicy $policy;
    
    /**
     * @param array $elements Array of elements to synchronize
     * @param callable $matcher Function that matches an element from the collection to its corresponding element
     * @return $this
     */
    public function sync(array $elements, callable $matcher): self
    {
        $copiedCollection = $this->collection;

        foreach ($this->collection as $key => $origin) {
            // find updated version of origin in provided array
            $updatedVersion = array_filter($elements, fn($element) => $matcher($element, $origin));

            // if origin is not found, then handle removal
            if (count($updatedVersion) === 0) {
                $this->policy->handleRemove($origin);

                unset($copiedCollection[$key]);
                continue;
            }

            // if origin is found then handle update
            if (count($updatedVersion) === 1) {
                $index = array_key_first($updatedVersion);

                $copiedCollection[$key] = $this->policy->handleUpdate($origin, reset($updatedVersion));

                unset($elements[$index]);
                continue;
            }

            // if origin is matched against more than one element, then throw exception
            throw new AmbiguousElementException("Provided array contains ambiguous element.");
        }

        array_walk($elements, function ($data) use (&$copiedCollection) {
            $copiedCollection[] = $this->policy->handleAdd($data);
        });

        return new static(array_values($copiedCollection), $this->policy);
    }
}

SynchronizedCollection encapsulates the whole mechanism of synchronization. The policy is responsible for providing a new or updated version of the entities. Under the hood, sync composes a new array and wraps it in another SynchronizedCollection. To retrieve the raw array from it, you can use a method such as toArray.

As you can see, it doesn’t follow the standard collection interface. However, it shouldn’t be a problem to implement missing methods or to use this method to extend your collection.
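
The constructor and toArray are among the parts omitted above. The sketch below shows how they might look, assuming the constructor simply stores the array and the policy. The code in the gist may differ. The sync body is a condensed stand-in so the example runs on its own, RuntimeException stands in for AmbiguousElementException, and the plain-array policy is purely hypothetical.

```php
<?php
interface SynchronizationPolicy
{
    public function handleAdd($data);
    public function handleUpdate($origin, $data);
    public function handleRemove($origin);
}

class SynchronizedCollection
{
    private array $collection;
    private SynchronizationPolicy $policy;

    // Assumed constructor: stores the wrapped array and the policy.
    public function __construct(array $collection, SynchronizationPolicy $policy)
    {
        $this->collection = $collection;
        $this->policy = $policy;
    }

    // Assumed accessor returning the raw array.
    public function toArray(): array
    {
        return $this->collection;
    }

    public function sync(array $elements, callable $matcher): self
    {
        $result = $this->collection;

        foreach ($this->collection as $key => $origin) {
            $matches = array_filter($elements, fn($element) => $matcher($element, $origin));

            if (count($matches) === 0) {
                $this->policy->handleRemove($origin);
                unset($result[$key]);
                continue;
            }
            if (count($matches) > 1) {
                throw new RuntimeException("Provided array contains ambiguous element.");
            }

            $result[$key] = $this->policy->handleUpdate($origin, reset($matches));
            unset($elements[array_key_first($matches)]); // mark as processed
        }

        // Whatever is left in $elements has no counterpart, so it is new.
        foreach ($elements as $data) {
            $result[] = $this->policy->handleAdd($data);
        }

        return new static(array_values($result), $this->policy);
    }
}

// Hypothetical policy that works on plain arrays instead of entities.
$policy = new class implements SynchronizationPolicy {
    public function handleAdd($data) { return $data; }
    public function handleUpdate($origin, $data) { return array_merge($origin, $data); }
    public function handleRemove($origin) { /* nothing to clean up for plain arrays */ }
};

$current  = [["id" => 1, "quantity" => 1], ["id" => 2, "quantity" => 1]];
$incoming = [["id" => 1, "quantity" => 2], ["id" => 4, "quantity" => 1]];

$synced = (new SynchronizedCollection($current, $policy))
    ->sync($incoming, fn($new, $old) => $new["id"] === $old["id"])
    ->toArray();
```

Here the element with id 1 is updated, the one with id 2 is removed, and the one with id 4 is added. The original $current array stays untouched.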

Finding the corresponding element

Let’s take a closer look at how we approach finding the corresponding element.

$updatedVersion = array_filter($elements, fn($element) => $matcher($element, $origin));

This seemingly enigmatic piece of code wraps the user-supplied $matcher in a closure, so the user only has to define a simple comparison instead of writing the array_filter call themselves. The matcher is a simple comparator and looks as follows:

function ($toSyncElement, $collectionElement): bool;

The first argument holds the data to sync, whereas the second argument is an actual element of the collection. The matcher should answer whether $toSyncElement concerns $collectionElement.
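
A matcher does not have to compare ids. As a small illustration, assuming plain arrays and a hypothetical sku key that is not part of the cart example, the same array_filter call works with any comparison:

```php
<?php
// Hypothetical matcher: case-insensitive comparison on a "sku" key.
$matcher = fn(array $toSyncElement, array $collectionElement): bool =>
    strcasecmp($toSyncElement["sku"], $collectionElement["sku"]) === 0;

$elements = [
    ["sku" => "abc-1", "quantity" => 5],
    ["sku" => "xyz-9", "quantity" => 1],
];
$origin = ["sku" => "ABC-1", "quantity" => 2];

// Same call as in sync(): keep only the elements the matcher accepts.
$updatedVersion = array_filter($elements, fn($element) => $matcher($element, $origin));
```

Because array_filter preserves keys, the key of the matched element can later be used to remove it from the list of elements that still need processing.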

SynchronizedCollection in action

Let’s refactor the previous code to use SynchronizedCollection.

$cart = $this->getCart();
$newCart = $request->request->get("cart");

$policy = new class($cart) implements SynchronizationPolicy {
    private $cart;

    public function __construct($cart)
    {
        $this->cart = $cart;
    }

    public function handleAdd($data)
    {
        return $this->cart->addProduct(new ProductRef($data["id"], $data["quantity"]));
    }

    public function handleUpdate($origin, $data)
    {
        return $origin->setQuantity($data["quantity"]);
    }

    public function handleRemove($origin)
    {
        $this->cart->removeProduct($origin);
    }
};

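// wrap the products, not the Cart itself (assumes getProducts() returns a plain array)
$cartCollection = new SynchronizedCollection($cart->getProducts(), $policy);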
$newCartCollection = $cartCollection->sync($newCart, fn($newProduct, $product) => $newProduct["id"] === $product->getId());

// flush changes into the database

If the definition of $policy looks too complicated for you, you can set up a dynamic policy where you can provide needed callbacks dynamically:

final class DynamicSynchronizationPolicy implements SynchronizationPolicy
{
    private $addCallback;
    private $updateCallback;
    private $removeCallback;

    public function __construct(callable $addCallback, callable $updateCallback, callable $removeCallback)
    {
        $this->addCallback = $addCallback;
        $this->updateCallback = $updateCallback;
        $this->removeCallback = $removeCallback;
    }

    public function handleAdd($data)
    {
        return call_user_func($this->addCallback, $data);
    }

    public function handleUpdate($origin, $data)
    {
        return call_user_func($this->updateCallback, $origin, $data);
    }

    public function handleRemove($origin)
    {
        call_user_func($this->removeCallback, $origin);
    }
}

If you wonder why I marked this class as final, check my other article about how to use the final keyword. Back to the topic, we can simplify our solution a bit more:

$cart = $this->getCart();
$newCart = $request->request->get("cart");

$policy = new DynamicSynchronizationPolicy(
    fn($data) => $cart->addProduct(new ProductRef($data["id"], $data["quantity"])),
    fn($origin, $data) => $origin->setQuantity($data["quantity"]),
    fn($origin) => $cart->removeProduct($origin)
);

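// wrap the products, not the Cart itself (assumes getProducts() returns a plain array)
$cartCollection = new SynchronizedCollection($cart->getProducts(), $policy);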
$newCartCollection = $cartCollection->sync($newCart, fn($newProduct, $product) => $newProduct["id"] === $product->getId());

// flush changes into the database

Isn’t it simple?

Where did we use it?

In one of our projects, we decided to use the Symfony Form component to handle incoming requests to the API and perform initial data validation. Instead of letting FormTypes fill entities directly, we utilized a DTO layer[2]. But the actual problem was related to one form type, to be precise, CollectionType.

It’s a topic for a different article, but, let’s admit it, the Symfony Form component was never designed to handle API requests. This use case is possible because we use only the backend side of the forms. Still, there are some visible differences between handling POST forms from browsers and API calls. A good example is CollectionType.

The CollectionType is index-based. It means that an element from the request is matched with the element at the same position in the collection. If you remove an element from the array and send it to the API, you unintentionally place the other objects in incorrect positions. Instead of removing that specific element, the component will update every entity/DTO based on its position and remove the last one, which has no counterpart in the submitted data. That’s why we built SynchronizedCollection – to avoid that behavior.
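
The effect is easy to reproduce with plain arrays. The sketch below only illustrates position-based versus id-based matching and is not the Form component’s actual code. The address-like data and all variable names are made up for the example.

```php
<?php
$stored = [
    ["id" => 1, "city" => "Berlin"],
    ["id" => 2, "city" => "Paris"],
    ["id" => 3, "city" => "Rome"],
];
// The client removed the element with id 2 and sent the rest.
$submitted = [
    ["id" => 1, "city" => "Berlin"],
    ["id" => 3, "city" => "Rome"],
];

// Position-based: slot N of the request overwrites slot N of the stored data,
// and stored slots without a counterpart are dropped.
$byPosition = [];
foreach ($submitted as $index => $data) {
    $byPosition[$index] = ["id" => $stored[$index]["id"], "city" => $data["city"]];
}

// Id-based: only stored elements whose id is still present survive.
$submittedIds = array_column($submitted, "id");
$byId = array_values(array_filter(
    $stored,
    fn($element) => in_array($element["id"], $submittedIds, true)
));
```

With position-based matching, the stored element with id 2 ends up holding the data of element 3, and element 3 is the one that disappears. With id-based matching, element 2 is removed as the user intended.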

This is only a brief outline of the problem. Perhaps in the future, the Form component will let us define the indexes rather than use the element’s position every time.

Summary

Surprisingly, SynchronizedCollection has settled into our application in several places and in several variations. It hides a complex problem behind a simple interface, and I think that’s the reason why it’s so useful.

The whole code is available as a GitHub Gist. Feel free to leave comments, suggest improvements, and so on. I don’t think that a Composer package is something our community needs, but that may change in the future.

If this article was useful for you, please let me know. Btw, I haven’t found a better title for it 😶.

Featured photo by John Mark Arnold on Unsplash.


  1. Our use case concerned collecting addresses, where the user could add many addresses to their profile. The whole operation was contained in the update user endpoint.

  2. At the time of writing this article, Symfony’s documentation has an open PR about using DTOs in form handling.