Neo4j Causal Cluster on GKE

I was inspired by Mark Needham's post last week, Kubernetes: Spinning up a Neo4j 3.1 Causal Cluster. I was going through this exact exercise myself, and his post really short-circuited my learning curve. Mark closed by saying his next steps were to spin the cluster up on both Google Container Engine (GKE) and AWS, and to add READ_REPLICA nodes into the mix. I couldn't wait, so I took what Mark had started with and built on it.

First you will need a Google Cloud Platform account. Google makes this an easy, low-risk proposition with its free trial, which gives new customers a $300 USD credit for a 60 day trial. Once you are signed up, check out the Google Container Engine (GKE) quickstart guide. You will need to enable billing (which involves a credit card) and create a project to proceed.

Using the Google Cloud Shell, you can get away without installing anything locally. But if you are really committed, you can download and install the Cloud SDK.
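
If you do install locally, the setup is roughly the following (these are the standard Cloud SDK commands; adjust for your platform):

# Authenticate, choose a project, and set a default zone
gcloud init

# kubectl is distributed as a Cloud SDK component
gcloud components install kubectl

# Sanity check that both tools are available
gcloud version
kubectl version --client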

The first thing to do is to create a new cluster in Google Container Engine (GKE). This can be done via the command line or via the UI.

The following is the GKE Create Cluster URL. The UI is very self-explanatory.

https://console.cloud.google.com/kubernetes/add

NOTE: When creating a new cluster, it is important to enable Kubernetes alpha features. Click on "More" and then select the alpha features check box.

Before the cluster is created, you can optionally click on "command line" at the bottom of the page to get the equivalent command line version of the UI options you just selected.

If you would rather create it via the command line and don't want to go through the UI at all, you can create a new container cluster with something like the following.

gcloud container clusters create "my-new-cluster" \
  --project "my-project-id" \
  --zone "us-central1-c" \
  --machine-type "n1-standard-1" \
  --image-type "GCI" \
  --disk-size "100" \
  --scopes "https://www.googleapis.com/auth/compute,https://www.googleapis.com/auth/devstorage.read_only,https://www.googleapis.com/auth/logging.write,https://www.googleapis.com/auth/servicecontrol,https://www.googleapis.com/auth/service.management.readonly,https://www.googleapis.com/auth/trace.append" \
  --num-nodes "3" \
  --network "default"

Replace the project id with your actual project id; other than that change, this command will create a cluster that you can use to run your Kubernetes Neo4j containers.
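
One step that is easy to miss: kubectl needs credentials for the new cluster before it can talk to it. Creating the cluster via the CLI normally configures this for you, but if you used the UI (or a different machine), the following, using the placeholder names from above, sets it up:

# Fetch credentials for the cluster and merge them into kubeconfig
gcloud container clusters get-credentials "my-new-cluster" \
  --zone "us-central1-c" \
  --project "my-project-id"

# The three GKE nodes should report as Ready
kubectl get nodes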

Now that the cluster of instances is set up, all that is left to do is deploy the containers. This can be accomplished with the exact command and template Mark used.

kubectl create -f neo4j.yaml

This will bring up the same Neo4j Causal Core Cluster that Mark had in his example.
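
To watch it come up, follow the pods as the PetSet creates them in ordinal order (pod names are the PetSet name plus an index):

# Pods appear one at a time: neo4j-core-0, neo4j-core-1, ...
kubectl get pods -w

# Once a pod is Running, its log shows the discovery and Raft activity
kubectl logs neo4j-core-0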

Now, say we want to add a few READ_REPLICA nodes to the causal cluster. How do we achieve that? I started with Mark's template and added a new StatefulSet (née PetSet) section for the read replicas. After some trial and error I came up with the following template.

# Headless service to provide DNS lookup
apiVersion: v1
kind: Service
metadata:
  labels:
    app: neo4j
  name: neo4j
spec:
  clusterIP: None
  ports:
  - port: 7474
    name: browser
  - port: 7687
    name: bolt
  selector:
    app: neo4j
---
# CORE servers: PetSet, from the alpha apps API
apiVersion: "apps/v1alpha1"
kind: PetSet
metadata:
  name: neo4j-core
spec:
  serviceName: neo4j
  replicas: 2
  template:
    metadata:
      annotations:
        pod.alpha.kubernetes.io/initialized: "true"
        pod.beta.kubernetes.io/init-containers: '[
            {
                "name": "install",
                "image": "gcr.io/google_containers/busybox:1.24",
                "command": ["/bin/sh", "-c", "echo \"
                unsupported.dbms.edition=enterprise\n
                dbms.mode=CORE\n
                dbms.connectors.default_advertised_address=$HOSTNAME.neo4j.default.svc.cluster.local\n
                dbms.connectors.default_listen_address=0.0.0.0\n
                dbms.connector.bolt.type=BOLT\n
                dbms.connector.bolt.enabled=true\n
                dbms.connector.bolt.listen_address=0.0.0.0:7687\n
                dbms.connector.http.type=HTTP\n
                dbms.connector.http.enabled=true\n
                dbms.connector.http.listen_address=0.0.0.0:7474\n
                causal_clustering.raft_messages_log_enable=true\n
                causal_clustering.initial_discovery_members=neo4j-core-0.neo4j.default.svc.cluster.local:5000,neo4j-core-1.neo4j.default.svc.cluster.local:5000,neo4j-core-2.neo4j.default.svc.cluster.local:5000\n
                causal_clustering.leader_election_timeout=2s\n
                  \" > /work-dir/neo4j.conf" ],
                "volumeMounts": [
                    {
                        "name": "confdir",
                        "mountPath": "/work-dir"
                    }
                ]
            }
        ]'
      labels:
        app: neo4j
        role: core
    spec:
      containers:
      - name: neo4j
        image: "gcr.io/my-project-id/neo4j-experimental:3.1.0-M13-beta3-enterprise"
        imagePullPolicy: Always
        ports:
        - containerPort: 5000
          name: discovery
        - containerPort: 6000
          name: tx
        - containerPort: 7000
          name: raft
        - containerPort: 7474
          name: browser
        - containerPort: 7687
          name: bolt
        securityContext:
          privileged: true
        volumeMounts:
        - name: datadir
          mountPath: /data
        - name: confdir
          mountPath: /conf
      volumes:
      - name: confdir
        # The init container writes neo4j.conf into this shared emptyDir
        emptyDir: {}
  volumeClaimTemplates:
  - metadata:
      name: datadir
      annotations:
        volume.alpha.kubernetes.io/storage-class: anything
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 1Gi
---
# READ_REPLICA servers: identical to the core PetSet except for the name, dbms.mode, and role label
apiVersion: "apps/v1alpha1"
kind: PetSet
metadata:
  name: neo4j-edge
spec:
  serviceName: neo4j
  replicas: 2
  template:
    metadata:
      annotations:
        pod.alpha.kubernetes.io/initialized: "true"
        pod.beta.kubernetes.io/init-containers: '[
            {
                "name": "install",
                "image": "gcr.io/google_containers/busybox:1.24",
                "command": ["/bin/sh", "-c", "echo \"
                unsupported.dbms.edition=enterprise\n
                dbms.mode=READ_REPLICA\n
                dbms.connectors.default_advertised_address=$HOSTNAME.neo4j.default.svc.cluster.local\n
                dbms.connectors.default_listen_address=0.0.0.0\n
                dbms.connector.bolt.type=BOLT\n
                dbms.connector.bolt.enabled=true\n
                dbms.connector.bolt.listen_address=0.0.0.0:7687\n
                dbms.connector.http.type=HTTP\n
                dbms.connector.http.enabled=true\n
                dbms.connector.http.listen_address=0.0.0.0:7474\n
                causal_clustering.raft_messages_log_enable=true\n
                causal_clustering.initial_discovery_members=neo4j-core-0.neo4j.default.svc.cluster.local:5000,neo4j-core-1.neo4j.default.svc.cluster.local:5000,neo4j-core-2.neo4j.default.svc.cluster.local:5000\n
                causal_clustering.leader_election_timeout=2s\n
                  \" > /work-dir/neo4j.conf" ],
                "volumeMounts": [
                    {
                        "name": "confdir",
                        "mountPath": "/work-dir"
                    }
                ]
            }
        ]'
      labels:
        app: neo4j
        role: edge
    spec:
      containers:
      - name: neo4j
        image: "gcr.io/my-project-id/neo4j-experimental:3.1.0-M13-beta3-enterprise"
        imagePullPolicy: Always
        ports:
        - containerPort: 5000
          name: discovery
        - containerPort: 6000
          name: tx
        - containerPort: 7000
          name: raft
        - containerPort: 7474
          name: browser
        - containerPort: 7687
          name: bolt
        securityContext:
          privileged: true
        volumeMounts:
        - name: datadir
          mountPath: /data
        - name: confdir
          mountPath: /conf
      volumes:
      - name: confdir
        # The init container writes neo4j.conf into this shared emptyDir
        emptyDir: {}
  volumeClaimTemplates:
  - metadata:
      name: datadir
      annotations:
        volume.alpha.kubernetes.io/storage-class: anything
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 1Gi
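
Assuming the template is saved over the original neo4j.yaml, one straightforward way to move to the two-PetSet layout is to tear down the first deployment and recreate it (the PersistentVolumeClaims created from the volumeClaimTemplates survive the delete):

# Replace the single-PetSet deployment with the core + edge version
kubectl delete -f neo4j.yaml
kubectl create -f neo4j.yaml

# Both PetSets and their pods should come up
kubectl get petsets
kubectl get pods -l app=neo4j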

I had to play around with the labels a bit to get the networking to work in both directions between the two StatefulSets. I still need to understand this a little better, but I ended up with two labels on each: "app" and "role". Giving each a different "app" yielded an incomplete solution, but using the same "app" and omitting the "role" only worked partially too.
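
My working theory on why this combination behaves that way: the headless service selects on "app: neo4j" alone, so pods from both PetSets register under the same DNS domain and can discover one another, while "role" keeps the two groups distinguishable without splitting the service. You can confirm that core and edge pods all sit behind the one service:

# Every pod labelled app=neo4j, core and edge alike, is an endpoint of the headless service
kubectl get endpoints neo4j

# Break the membership down by role
kubectl get pods -l app=neo4j,role=core
kubectl get pods -l app=neo4j,role=edge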

Now the cluster can be scaled horizontally via the two StatefulSets (PetSets). The CORE nodes can be scaled separately from the EDGE nodes, allowing each StatefulSet (PetSet) to be scaled in isolation from the other(s). The following command scales the CORE PetSet.

kubectl scale petset neo4j-core --replicas=3

While this one scales the EDGE PetSet.

kubectl scale petset neo4j-edge --replicas=5
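
Scaling just creates (or removes) pods at the end of the ordinal sequence, so the new members are easy to spot:

# After scaling: neo4j-core-2, plus neo4j-edge-2 through neo4j-edge-4
kubectl get pods -l app=neo4j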

The resulting scaled causal cluster can easily be viewed in the Neo4j browser using kubectl port-forward.
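
For example, forwarding the HTTP and Bolt ports of one core member (pod names as in the template above) makes the browser available at http://localhost:7474:

# Forward the browser (7474) and Bolt (7687) ports of a core pod to localhost
kubectl port-forward neo4j-core-0 7474:7474 7687:7687

From there, running CALL dbms.cluster.overview() in the browser lists every CORE and READ_REPLICA member of the scaled cluster along with its role.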
