
Commit

Update Bigtable Programmatic Scaling Example (GoogleCloudPlatform/python-docs-samples#1003)

* Update Bigtable Programmatic Scaling Example

* Rename "autoscaling" to "metricscaler" and the the term "programmatic
scaling"
* Remove `strategies.py` to simplify example
* Fix wrong sleep length bug
* Add maximum node count

* hegemonic review
waprin authored Jun 27, 2017
Commit b7e42e5
Showing 5 changed files with 415 additions and 0 deletions.
132 changes: 132 additions & 0 deletions samples/metricscaler/README.rst
@@ -0,0 +1,132 @@
.. This file is automatically generated. Do not edit this file directly.
Google Cloud Bigtable Python Samples
===============================================================================

This directory contains samples for Google Cloud Bigtable. `Google Cloud Bigtable`_ is Google's NoSQL Big Data database service. It's the same database that powers many core Google services, including Search, Analytics, Maps, and Gmail.


This sample demonstrates how to use `Stackdriver Monitoring`_
to scale Cloud Bigtable based on CPU usage.

.. _Stackdriver Monitoring: http://cloud.google.com/monitoring/docs/


.. _Google Cloud Bigtable: https://cloud.google.com/bigtable/docs/

Setup
-------------------------------------------------------------------------------


Authentication
++++++++++++++

Authentication is typically done through `Application Default Credentials`_,
which means you do not have to change the code to authenticate as long as
your environment has credentials. You have a few options for setting up
authentication:

#. When running locally, use the `Google Cloud SDK`_

   .. code-block:: bash

       gcloud auth application-default login

#. When running on App Engine or Compute Engine, credentials are already
   set up. However, you may need to configure your Compute Engine instance
   with `additional scopes`_.

#. You can create a `Service Account key file`_. This file can be used to
authenticate to Google Cloud Platform services from any environment. To use
the file, set the ``GOOGLE_APPLICATION_CREDENTIALS`` environment variable to
the path to the key file, for example:

   .. code-block:: bash

       export GOOGLE_APPLICATION_CREDENTIALS=/path/to/service_account.json

.. _Application Default Credentials: https://cloud.google.com/docs/authentication#getting_credentials_for_server-centric_flow
.. _additional scopes: https://cloud.google.com/compute/docs/authentication#using
.. _Service Account key file: https://developers.google.com/identity/protocols/OAuth2ServiceAccount#creatinganaccount

Install Dependencies
++++++++++++++++++++

#. Install `pip`_ and `virtualenv`_ if you do not already have them.

#. Create a virtualenv. Samples are compatible with Python 2.7 and 3.4+.

   .. code-block:: bash

       $ virtualenv env
       $ source env/bin/activate

#. Install the dependencies needed to run the samples.

   .. code-block:: bash

       $ pip install -r requirements.txt

.. _pip: https://pip.pypa.io/
.. _virtualenv: https://virtualenv.pypa.io/

Samples
-------------------------------------------------------------------------------

Metricscaling example
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++



To run this sample:

.. code-block:: bash

    $ python metricscaler.py

    usage: metricscaler.py [-h] [--high_cpu_threshold HIGH_CPU_THRESHOLD]
                           [--low_cpu_threshold LOW_CPU_THRESHOLD]
                           [--short_sleep SHORT_SLEEP] [--long_sleep LONG_SLEEP]
                           bigtable_instance bigtable_cluster

    Scales Cloud Bigtable clusters based on CPU usage.

    positional arguments:
      bigtable_instance     ID of the Cloud Bigtable instance to connect to.
      bigtable_cluster      ID of the Cloud Bigtable cluster to connect to.

    optional arguments:
      -h, --help            show this help message and exit
      --high_cpu_threshold HIGH_CPU_THRESHOLD
                            If Cloud Bigtable CPU usage is above this threshold,
                            scale up
      --low_cpu_threshold LOW_CPU_THRESHOLD
                            If Cloud Bigtable CPU usage is below this threshold,
                            scale down
      --short_sleep SHORT_SLEEP
                            How long to sleep in seconds between checking metrics
                            after no scale operation
      --long_sleep LONG_SLEEP
                            How long to sleep in seconds between checking metrics
                            after a scaling operation
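
For example, to watch a cluster with a higher scale-up threshold (the
instance and cluster IDs below are placeholders for your own):

.. code-block:: bash

    $ python metricscaler.py my-instance my-cluster --high_cpu_threshold 0.7
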
The client library
-------------------------------------------------------------------------------

This sample uses the `Google Cloud Client Library for Python`_.
You can read the documentation for more details on API usage and use GitHub
to `browse the source`_ and `report issues`_.

.. _Google Cloud Client Library for Python:
https://googlecloudplatform.github.io/google-cloud-python/
.. _browse the source:
https://github.com/GoogleCloudPlatform/google-cloud-python
.. _report issues:
https://github.com/GoogleCloudPlatform/google-cloud-python/issues


.. _Google Cloud SDK: https://cloud.google.com/sdk/
27 changes: 27 additions & 0 deletions samples/metricscaler/README.rst.in
@@ -0,0 +1,27 @@
# This file is used to generate README.rst

product:
name: Google Cloud Bigtable
short_name: Cloud Bigtable
url: https://cloud.google.com/bigtable/docs/
description: >
`Google Cloud Bigtable`_ is Google's NoSQL Big Data database service. It's
the same database that powers many core Google services, including Search,
Analytics, Maps, and Gmail.

description: |
This sample demonstrates how to use `Stackdriver Monitoring`_
to scale Cloud Bigtable based on CPU usage.

.. _Stackdriver Monitoring: http://cloud.google.com/monitoring/docs/

setup:
- auth
- install_deps

samples:
- name: Metricscaling example
file: metricscaler.py
show_help: true

cloud_client_library: true
165 changes: 165 additions & 0 deletions samples/metricscaler/metricscaler.py
@@ -0,0 +1,165 @@
# Copyright 2017 Google Inc.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

"""Sample that demonstrates how to use Stackdriver Monitoring metrics to
programmatically scale a Google Cloud Bigtable cluster."""

import argparse
import time

from google.cloud import bigtable
from google.cloud import monitoring


def get_cpu_load():
"""Returns the most recent Cloud Bigtable CPU load measurement.
Returns:
float: The most recent Cloud Bigtable CPU usage metric
"""
# [START bigtable_cpu]
client = monitoring.Client()
query = client.query('bigtable.googleapis.com/cluster/cpu_load', minutes=5)
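    # The query is assumed to return time series newest-first, so the first
    # point of the first series is the latest CPU sample.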
time_series = list(query)
recent_time_series = time_series[0]
return recent_time_series.points[0].value
# [END bigtable_cpu]


def scale_bigtable(bigtable_instance, bigtable_cluster, scale_up):
"""Scales the number of Cloud Bigtable nodes up or down.
Edits the number of nodes in the Cloud Bigtable cluster to be increased
or decreased, depending on the `scale_up` boolean argument. Currently
the `incremental` strategy from `strategies.py` is used.
Args:
bigtable_instance (str): Cloud Bigtable instance ID to scale
bigtable_cluster (str): Cloud Bigtable cluster ID to scale
scale_up (bool): If true, scale up, otherwise scale down
"""
    # The minimum number of nodes to use. The default minimum is 3. If you
    # have a lot of data, the rule of thumb is to not go below 2.5 TB per
    # node for SSD clusters, and 8 TB for HDD. The
    # bigtable.googleapis.com/disk/bytes_used metric is useful in figuring
    # out the minimum number of nodes.
    _MIN_NODE_COUNT = 3

    # The maximum number of nodes to use. The default maximum is 30 nodes
    # per zone. If you need more quota, you can request more by following
    # the instructions at https://cloud.google.com/bigtable/quota.
    _MAX_NODE_COUNT = 30

    # The number of nodes to change the cluster by.
    _SIZE_CHANGE_STEP = 3
# [START bigtable_scale]
bigtable_client = bigtable.Client(admin=True)
instance = bigtable_client.instance(bigtable_instance)
instance.reload()

cluster = instance.cluster(bigtable_cluster)
cluster.reload()

current_node_count = cluster.serve_nodes

if scale_up:
if current_node_count < _MAX_NODE_COUNT:
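            # Clamp the increase so a scale-up never exceeds _MAX_NODE_COUNT.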
            new_node_count = min(
                current_node_count + _SIZE_CHANGE_STEP, _MAX_NODE_COUNT)
cluster.serve_nodes = new_node_count
cluster.update()
print('Scaled up from {} to {} nodes.'.format(
current_node_count, new_node_count))
else:
if current_node_count > _MIN_NODE_COUNT:
new_node_count = max(
current_node_count - _SIZE_CHANGE_STEP, _MIN_NODE_COUNT)
cluster.serve_nodes = new_node_count
cluster.update()
print('Scaled down from {} to {} nodes.'.format(
current_node_count, new_node_count))
# [END bigtable_scale]


def main(
bigtable_instance,
bigtable_cluster,
high_cpu_threshold,
low_cpu_threshold,
short_sleep,
long_sleep):
"""Main loop runner that autoscales Cloud Bigtable.
Args:
bigtable_instance (str): Cloud Bigtable instance ID to autoscale
high_cpu_threshold (float): If CPU is higher than this, scale up.
low_cpu_threshold (float): If CPU is lower than this, scale down.
short_sleep (int): How long to sleep after no operation
long_sleep (int): How long to sleep after the number of nodes is
changed
"""
cluster_cpu = get_cpu_load()
    print('Detected CPU of {}'.format(cluster_cpu))
if cluster_cpu > high_cpu_threshold:
scale_bigtable(bigtable_instance, bigtable_cluster, True)
time.sleep(long_sleep)
elif cluster_cpu < low_cpu_threshold:
scale_bigtable(bigtable_instance, bigtable_cluster, False)
time.sleep(long_sleep)
else:
print('CPU within threshold, sleeping.')
time.sleep(short_sleep)


if __name__ == '__main__':
parser = argparse.ArgumentParser(
description='Scales Cloud Bigtable clusters based on CPU usage.')
parser.add_argument(
'bigtable_instance',
help='ID of the Cloud Bigtable instance to connect to.')
parser.add_argument(
'bigtable_cluster',
help='ID of the Cloud Bigtable cluster to connect to.')
parser.add_argument(
'--high_cpu_threshold',
help='If Cloud Bigtable CPU usage is above this threshold, scale up',
default=0.6)
parser.add_argument(
'--low_cpu_threshold',
help='If Cloud Bigtable CPU usage is below this threshold, scale down',
default=0.2)
parser.add_argument(
'--short_sleep',
help='How long to sleep in seconds between checking metrics after no '
'scale operation',
default=60)
parser.add_argument(
'--long_sleep',
help='How long to sleep in seconds between checking metrics after a '
'scaling operation',
default=60 * 10)
args = parser.parse_args()

while True:
main(
args.bigtable_instance,
args.bigtable_cluster,
float(args.high_cpu_threshold),
float(args.low_cpu_threshold),
int(args.short_sleep),
int(args.long_sleep))