Origin
Have you ever been frustrated with a massive monolithic application? The codebase keeps growing, and you have to be extremely careful when modifying even a small feature, fearing it might affect other modules. As a Python developer, I deeply relate to this. Today, I'll share my insights from refactoring an e-commerce system and show you how to build a flexible and reliable microservices architecture using Python.
Pain Points
Last year, I took over a traditional e-commerce system - a massive Django application. As business rapidly grew, this monolithic application became increasingly bloated. Every new feature deployment required redeploying the entire system, and bugs in one module frequently caused system-wide failures.
Worse still, different business modules were tightly coupled. For instance, the payment module had to call interfaces from both the order and user modules, making the code difficult to maintain. Our development teams were also bound together, requiring multi-team coordination for any changes, severely impacting development efficiency.
Have you encountered similar challenges?
The Turning Point
At the end of last year, we decided to transform the system into microservices. We chose Python as our primary development language because:
First, Python's ecosystem is incredibly rich. Frameworks like Flask and Django REST framework provide excellent support for microservices development. For example, Flask's built-in Blueprint feature is a natural fit for organizing routes by module, which maps cleanly onto service boundaries.
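A minimal Blueprint sketch (the names and URL prefix here are just illustrative) looks like this:

from flask import Flask, Blueprint, jsonify

# Group the user-related routes under one Blueprint
user_bp = Blueprint('users', __name__, url_prefix='/users')

@user_bp.route('/<user_id>')
def get_user(user_id):
    return jsonify({'id': user_id})

app = Flask(__name__)
app.register_blueprint(user_bp)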
Second, Python's syntax is concise and elegant, which keeps development efficiency high. A microservices architecture means writing a large number of API endpoints, and Python's readability and maintainability make that noticeably easier than in many other languages.
Finally, Python has many ready-to-use middleware options. There are comprehensive Python client libraries for RabbitMQ, Redis, MongoDB, and others.
Implementation
We ultimately split the system into several core microservices:
User service handles user management and authentication:
from flask import Flask, request, jsonify
from flask_jwt_extended import JWTManager, create_access_token

app = Flask(__name__)
app.config['JWT_SECRET_KEY'] = 'your-secret-key'
jwt = JWTManager(app)

@app.route('/auth/login', methods=['POST'])
def login():
    credentials = request.json
    # Verify user credentials against the user store (lookup elided),
    # then issue a JWT for the authenticated user
    user_id = verify_credentials(credentials['username'], credentials['password'])
    access_token = create_access_token(identity=user_id)
    return jsonify({'token': access_token})

@app.route('/users/<user_id>')
def get_user(user_id):
    # Get user information from the user store (lookup elided)
    user_info = fetch_user(user_id)
    return jsonify(user_info)
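For a quick local smoke test (the port and credential fields here are placeholders for whatever your setup uses), the service can be exercised with requests:

import requests

# Log in and grab the JWT
resp = requests.post('http://localhost:5000/auth/login',
                     json={'username': 'alice', 'password': 'secret'})
token = resp.json()['token']

# Pass the token along when calling other endpoints
profile = requests.get('http://localhost:5000/users/42',
                       headers={'Authorization': f'Bearer {token}'})
print(profile.json())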
Product service manages product information:
from flask import Flask, jsonify
from flask_sqlalchemy import SQLAlchemy

app = Flask(__name__)
# Database URI is a placeholder; point this at your real product database
app.config['SQLALCHEMY_DATABASE_URI'] = 'sqlite:///products.db'
db = SQLAlchemy(app)

class Product(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    name = db.Column(db.String(100))
    price = db.Column(db.Float)
    stock = db.Column(db.Integer)

@app.route('/products')
def list_products():
    products = Product.query.all()
    return jsonify([{
        'id': p.id,
        'name': p.name,
        'price': p.price,
        'stock': p.stock
    } for p in products])
Order service handles order-related business:
from flask import Flask, request, jsonify
import json
import requests
from redis import Redis

app = Flask(__name__)
redis = Redis(host='localhost', port=6379)

@app.route('/orders', methods=['POST'])
def create_order():
    order_data = request.json
    # Check product inventory by calling the product service
    product_service_url = 'http://product-service/products/'
    product = requests.get(f"{product_service_url}{order_data['product_id']}").json()
    if product['stock'] < order_data['quantity']:
        return jsonify({'error': 'Out of stock'}), 400
    # Create the order and persist it in Redis
    order_id = generate_order_id()
    redis.set(f"order:{order_id}", json.dumps(order_data))
    # Send a message to the payment service (helper defined elsewhere)
    publish_to_payment_service(order_id, order_data)
    return jsonify({'order_id': order_id})
Payment service handles payment processing:
from flask import Flask, request, jsonify
from celery import Celery

app = Flask(__name__)
celery = Celery('payment_service', broker='redis://localhost:6379/0')

@app.route('/payments', methods=['POST'])
def process_payment():
    payment_data = request.json
    # Hand the payment off to Celery and return immediately
    process_payment_task.delay(payment_data)
    return jsonify({'status': 'Processing'})

@celery.task
def process_payment_task(payment_data):
    # Call the payment gateway (gateway client configured elsewhere)
    gateway_response = payment_gateway.process(payment_data)
    if gateway_response.is_successful:
        # Update order status
        update_order_status(payment_data['order_id'], 'paid')
        # Send notification
        notify_user(payment_data['user_id'], 'Payment successful')
This microservices architecture brought many benefits:
- Independent Deployment: Each microservice is an independent Python application that can be deployed and scaled separately. For example, during promotions we can scale only the product and order services without touching the other services.
- Fault Isolation: During last year's Singles' Day sale, thanks to the microservices architecture, even when the payment service experienced brief downtime, users could still browse products and add items to their carts; the failure did not take down the entire system.
- Flexible Technology Stack: Different services can use different technology stacks. For example, the user service uses MongoDB to store user data, while the order service uses PostgreSQL for order information. Each team can choose the most suitable technology (see the sketch after this list).
- Team Autonomy: Each microservice is now managed by a dedicated team that can develop and release independently at its own pace, without waiting for other teams.
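To make the flexible-stack point concrete, here is a rough sketch of a MongoDB-backed user lookup in the user service; the connection string and database/collection names are placeholders, not our production values:

from pymongo import MongoClient

# MongoDB-backed user lookup (placeholder connection string and names)
mongo = MongoClient('mongodb://localhost:27017')
users = mongo['user_service']['users']

def fetch_user(user_id):
    # Return the stored user document without Mongo's internal _id field
    return users.find_one({'user_id': user_id}, {'_id': 0})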
Of course, microservices architecture also brought some challenges:
- Service Communication: Services need to communicate via HTTP APIs or message queues. We use RabbitMQ as our message middleware for handling asynchronous communication between services:
import pika

# Publish an "order created" event to RabbitMQ
connection = pika.BlockingConnection(
    pika.ConnectionParameters('localhost'))
channel = connection.channel()

channel.queue_declare(queue='order_created')
channel.basic_publish(
    exchange='',
    routing_key='order_created',
    body='Order ID: 12345')

connection.close()
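On the consuming side, a worker subscribes to the same queue. A minimal pika consumer (with a deliberately simplified callback) looks roughly like this:

import pika

def handle_order_created(ch, method, properties, body):
    # React to the event, e.g. reserve stock or kick off payment processing
    print(f"Received: {body.decode()}")

connection = pika.BlockingConnection(pika.ConnectionParameters('localhost'))
channel = connection.channel()
channel.queue_declare(queue='order_created')
channel.basic_consume(queue='order_created',
                      on_message_callback=handle_order_created,
                      auto_ack=True)
channel.start_consuming()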
- Data Consistency: Maintaining data consistency in a distributed system is challenging. We adopted an eventual consistency approach, using message queues and compensation mechanisms to ensure data eventually becomes consistent:
from flask import Flask
from redis import Redis
import json

app = Flask(__name__)
redis = Redis(host='localhost', port=6379)

def compensate_failed_order(order_id):
    # Get the original order information from Redis
    order_data = json.loads(redis.get(f"order:{order_id}"))
    # Restore inventory
    restore_product_stock(order_data['product_id'], order_data['quantity'])
    # Process the refund
    refund_payment(order_data['payment_id'])
    # Mark the order as cancelled
    update_order_status(order_id, 'cancelled')
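How the compensation gets triggered depends on your wiring. One possible way, sketched here as a Celery task (the task name is illustrative, not our exact code), is for the payment worker to enqueue it when the gateway rejects a payment:

from celery import Celery

celery = Celery('payment_service', broker='redis://localhost:6379/0')

@celery.task
def handle_payment_failure(order_id):
    # Run the compensation flow off the request path
    compensate_failed_order(order_id)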
- Monitoring and Tracing: To identify issues promptly, we use Prometheus and Grafana to monitor the performance metrics of each service:
from prometheus_client import Counter, Histogram
from functools import wraps
import time

REQUEST_COUNT = Counter('request_count', 'App Request Count')
REQUEST_LATENCY = Histogram('request_latency_seconds', 'Request latency')

def track_metrics(f):
    @wraps(f)
    def decorated_function(*args, **kwargs):
        REQUEST_COUNT.inc()
        start_time = time.time()
        response = f(*args, **kwargs)
        REQUEST_LATENCY.observe(time.time() - start_time)
        return response
    return decorated_function
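To show how this gets wired in (the route and response here are illustrative), the decorator sits on each endpoint and the service exposes a /metrics endpoint for Prometheus to scrape:

from flask import Flask
from prometheus_client import generate_latest, CONTENT_TYPE_LATEST

app = Flask(__name__)

@app.route('/products')
@track_metrics  # count and time every request to this endpoint
def list_products():
    return '[]'

@app.route('/metrics')
def metrics():
    # Expose the collected metrics in the Prometheus text format
    return generate_latest(), 200, {'Content-Type': CONTENT_TYPE_LATEST}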
Lessons Learned
After more than a year of practice, I've summarized several key experiences:
- Reasonable Service Boundary Division: Don't create microservices just for the sake of it. Initially, we split the system too finely, resulting in frequent cross-service calls that actually reduced performance. Later, we reorganized services around business functions, keeping closely related functionality together, which worked much better.
- Asynchronous Communication is Important: In a microservices architecture, use asynchronous communication wherever possible. For example, after an order is created, we notify the inventory and payment services asynchronously through message queues, which improves system response speed and reliability; a rough sketch of that fan-out follows below.
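Queue names and the payload shape in this sketch are illustrative, not our exact setup; the idea is that the order service publishes one event and returns immediately, while the downstream services consume it from their own queues:

import json
import pika

def publish_order_created(order_id, order_data):
    # Fire-and-forget: enqueue the event, then return to the caller immediately
    connection = pika.BlockingConnection(pika.ConnectionParameters('localhost'))
    channel = connection.channel()
    event = json.dumps({'order_id': order_id, **order_data})
    # One queue per downstream consumer (illustrative names)
    for queue in ('inventory_events', 'payment_events'):
        channel.queue_declare(queue=queue)
        channel.basic_publish(exchange='', routing_key=queue, body=event)
    connection.close()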
- Implement Proper Fault Tolerance: In distributed systems, network failures are inevitable, so build in service degradation and circuit breaking. We use the Circuit Breaker pattern to fail fast when a dependency is unavailable:
import requests
from pybreaker import CircuitBreaker, CircuitBreakerError

breaker = CircuitBreaker(fail_max=5, reset_timeout=60)

@breaker
def call_product_service(product_id):
    # Let failures propagate so the breaker can count them and open the circuit
    response = requests.get(f"http://product-service/products/{product_id}", timeout=3)
    response.raise_for_status()
    return response.json()

def get_product(product_id):
    try:
        return call_product_service(product_id)
    except (CircuitBreakerError, requests.exceptions.RequestException):
        # Fallback handling when the service is unavailable or the circuit is open
        return get_product_from_cache(product_id)
- CI/CD is Important: Each microservice needs its own build and deployment pipeline. We use GitLab CI and Docker for automated deployment:
stages:
  - test
  - build
  - deploy

test:
  stage: test
  script:
    - pytest

build:
  stage: build
  script:
    - docker build -t $CI_REGISTRY_IMAGE .

deploy:
  stage: deploy
  script:
    - kubectl apply -f k8s/
Future Outlook
Microservices architecture isn't a silver bullet, but it has helped us solve many problems. If your team is considering microservices transformation, I suggest:
- Start with the most independent business modules
- Focus on infrastructure development, including service discovery, a configuration center, and monitoring systems (a service-registration sketch follows after this list)
- Establish good development standards and documentation mechanisms
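For the service discovery piece, one common option is Consul with the python-consul client. The snippet below is only a sketch of a registration call; the addresses, ports, and service names are placeholders rather than anything from our actual setup:

import consul

# Register this service instance with Consul on startup (placeholder values)
c = consul.Consul(host='localhost', port=8500)
c.agent.service.register(
    name='product-service',
    service_id='product-service-1',
    address='10.0.0.5',
    port=5000,
    # Health check so Consul can drop dead instances from discovery
    check=consul.Check.http('http://10.0.0.5:5000/health', interval='10s'))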
Do you think microservices architecture suits your project? Feel free to share your thoughts and experiences in the comments.
Let's discuss how to build better microservices systems with Python. If you want to know more details, let me know which aspects interest you, and I'll be happy to share more in-depth insights.