Building a Transparent SAML Authentication Layer for an API Gateway with Service Worker and Rollup


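An awkward reality: integrating traditional SAML authentication into a modern single-page application (SPA) tends to produce a terrible user experience. Once the backend session expires, every subsequent API request is intercepted by the API gateway, which forces a full-page redirect to the identity provider (IdP). This not only interrupts whatever the user was doing, it also destroys all of the SPA's in-memory state. For a complex B2B application that relies heavily on client-side state, that is unacceptable.

Our goal is clear: after the SAML session expires, renew authentication in a way the user does not notice. API requests should be able to "pause", wait for authentication to complete, and then retry automatically, with no page navigation and the application state intact. The core challenge is that JavaScript on the browser's main thread cannot directly intervene in or capture a cross-origin, redirect-based authentication flow.

The initial idea is to implement a proxy on the client. It must intercept every outgoing API request, inspect the response, and, when it detects an authentication failure, start and complete the SAML flow independently of the main application thread. A Service Worker is the natural fit for this role: it acts as a programmable network proxy sitting between the browser and the network.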

sequenceDiagram
    participant SPA as SPA (Main Thread)
    participant SW as Service Worker
    participant APIGW as API Gateway (SAML SP)
    participant IdP as Identity Provider

    SPA->>SW: fetch('/api/data')
    SW->>APIGW: Forward fetch('/api/data')
    Note over APIGW: Session expired
    APIGW-->>SW: Respond with 401 Unauthorized or a special redirect header
    
    Note over SW: Intercepts 401. Halts response to SPA.
    SW->>SW: Initiate transparent auth flow.
    SW->>APIGW: Open auth window (clients.openWindow) at '/saml/login'
    APIGW->>IdP: Redirect auth window to IdP for authentication
    IdP-->>APIGW: User authenticates, POST SAML Assertion to ACS URL
    Note over APIGW: Validates assertion, creates new session cookie.
    APIGW-->>SW: Auth window signals success (BroadcastChannel)
    
    Note over SW: Auth complete. New session cookie is set.
    SW->>APIGW: Retry original fetch('/api/data')
    APIGW-->>SW: Respond with 200 OK and data
    SW-->>SPA: Return final successful response
    Note over SPA: Request completes successfully, slightly delayed. State is preserved.

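In this architecture, each component has a clear responsibility:

  • API Gateway: acts as the SAML service provider (SP) and handles all interaction with the IdP, including generating authentication requests, parsing SAML assertions, and managing user sessions. It is the backend anchor of the whole security setup.
  • Service Worker: the core client-side proxy. It intercepts fetch requests, recognizes the signal that authentication has lapsed, and completes re-authentication behind the scenes.
  • Rollup: the build tool. A production Service Worker is rarely a single hand-written JS file; it may pull in helper libraries and environment variables, and it has to be compiled and minified into an efficient, dependency-free script. Rollup focuses on bundling JavaScript modules, which makes it well suited to building a standalone worker script like this.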

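Step 1: The API Gateway as a SAML Service Provider

We will not implement a SAML SP from scratch here; real projects usually rely on an existing library or gateway plugin. The following is a conceptual implementation based on Node.js and Express, meant to illustrate the gateway's core responsibilities.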

// file: gateway/saml-sp.js
// A conceptual implementation of the SAML SP logic on an API Gateway

const express = require('express');
const session = require('express-session');
const passport = require('passport');
const SamlStrategy = require('passport-saml').Strategy;
const fs = require('fs');

const IDP_ENTRY_POINT = 'https://idp.example.com/SAML2/SSO/POST';
const SP_ENTITY_ID = 'urn:example:sp';

// In a real project, this would be managed securely
const spCert = fs.readFileSync('./certs/sp.crt', 'utf-8');
const spKey = fs.readFileSync('./certs/sp.key', 'utf-8');

passport.use(new SamlStrategy(
  {
    path: '/login/callback', // Assertion Consumer Service (ACS) URL
    entryPoint: IDP_ENTRY_POINT,
    issuer: SP_ENTITY_ID,
    cert: fs.readFileSync('./certs/idp.crt', 'utf-8'), // IdP's public certificate
    privateKey: spKey,
    decryptionPvk: spKey,
    signatureAlgorithm: 'sha256',
  },
  (profile, done) => {
    // Here, find or create a user in your system based on the SAML profile
    // profile contains attributes like email, nameID, etc.
    console.log('SAML profile received:', profile);
    return done(null, { id: profile.nameID, email: profile.email });
  }
));

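const app = express();

// The IdP posts the assertion as a form field (SAMLResponse), so the ACS
// route needs URL-encoded body parsing.
app.use(express.urlencoded({ extended: false }));

app.use(session({
  secret: 'a-very-secret-key-for-session', // Load from configuration in production
  resave: false,
  saveUninitialized: true,
  cookie: { httpOnly: true, secure: true } // Production should be secure
}));

// Passport needs these to persist the user in the session.
// For this demo the whole (small) user object is stored as-is.
passport.serializeUser((user, done) => done(null, user));
passport.deserializeUser((user, done) => done(null, user));

app.use(passport.initialize());
app.use(passport.session());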

// Middleware to protect API routes
function ensureAuthenticated(req, res, next) {
  if (req.isAuthenticated()) {
    return next();
  }
  // Instead of a 302 redirect, we return 401 for XHR requests
  // This is the signal our Service Worker will listen for.
  res.status(401).json({ error: 'Session expired. Re-authentication required.' });
}

// The route that initiates the SAML login flow
app.get('/saml/login',
  passport.authenticate('saml', {
    successRedirect: '/saml/auth-success', // A page to signal success to the SW
    failureRedirect: '/saml/auth-failure'
  })
);

// The ACS endpoint where the IdP posts the SAML assertion
app.post('/login/callback',
  passport.authenticate('saml', {
    // failureFlash is left out: it relies on flash middleware (connect-flash),
    // which this sketch does not install.
    failureRedirect: '/saml/auth-failure'
  }),
  (req, res) => {
    // On success, redirect to a specific page that can communicate with the SW
    res.redirect('/saml/auth-success');
  }
);

// A simple page that the auth window loads on success.
// It broadcasts a message that the Service Worker listens for.
app.get('/saml/auth-success', (req, res) => {
    res.send(`
        <script>
            // This script runs in the auth window.
            // BroadcastChannel delivers the message to same-origin contexts,
            // including the Service Worker.
            const channel = new BroadcastChannel('auth-channel');
            channel.postMessage({ status: 'SUCCESS' });
            channel.close();
            // Try to close the window; browsers may ignore this for windows
            // that were not opened by script.
            window.close();
        </script>
        <h1>Authentication Successful</h1>
        <p>This window can be closed.</p>
    `);
});

// The failure counterpart, so the Service Worker is not left waiting.
app.get('/saml/auth-failure', (req, res) => {
    res.status(401).send(`
        <script>
            const channel = new BroadcastChannel('auth-channel');
            channel.postMessage({ status: 'FAILURE' });
            channel.close();
        </script>
        <h1>Authentication Failed</h1>
    `);
});


// Protected API endpoint
app.get('/api/data', ensureAuthenticated, (req, res) => {
  res.json({
    user: req.user,
    sensitiveData: 'some-very-important-data'
  });
});

// Serve the main SPA
app.use(express.static('public'));

app.listen(3000, () => {
    console.log('API Gateway (SAML SP) listening on port 3000');
});

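The key piece here is the ensureAuthenticated middleware. For API requests (typically XHR or fetch), it no longer answers with a 302 redirect but with an explicit 401 status code. That 401 is the contract on which the Service Worker acts.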
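The contract can be sketched as a variant of the middleware that answers API calls with 401 while still sending ordinary page navigations to the login route. This is a minimal sketch under assumptions: the `isApiRequest` helper, the `/api/` prefix check, and the Accept-header heuristic are illustrative and not part of the gateway code above.

```javascript
// Hypothetical helper: decide whether a request is a programmatic API call.
// Assumption: API routes live under /api/ or the client asks for JSON.
function isApiRequest(req) {
  const accept = String((req.headers && req.headers.accept) || '');
  return req.path.startsWith('/api/') || accept.includes('application/json');
}

// Express-style middleware (req, res, next).
function ensureAuthenticatedOrChallenge(req, res, next) {
  if (typeof req.isAuthenticated === 'function' && req.isAuthenticated()) {
    return next();
  }
  if (isApiRequest(req)) {
    // The 401 is the signal the Service Worker listens for.
    return res.status(401).json({ error: 'Session expired. Re-authentication required.' });
  }
  // A normal page load can safely follow the classic redirect flow.
  return res.redirect('/saml/login');
}
```

Keeping the redirect for page loads means a user who opens the application with no session at all still reaches the IdP in the usual way; only background API calls get the 401.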

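Step 2: Building a Robust Service Worker with Rollup

A Service Worker script runs in an environment isolated from the main thread. It has its own global scope and cannot access the DOM, so its build needs some special handling.

The rollup.config.js looks like this: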

// file: rollup.config.js
import resolve from '@rollup/plugin-node-resolve';
import commonjs from '@rollup/plugin-commonjs';
import terser from '@rollup/plugin-terser';
import { babel } from '@rollup/plugin-babel';

const isProduction = process.env.NODE_ENV === 'production';

export default {
  input: 'src/sw.js',
  output: {
    file: 'public/sw.js', // Output to the public directory to be served
    format: 'iife', // Self-contained classic script; register('/sw.js') loads the worker as a classic script by default
    sourcemap: !isProduction,
  },
  plugins: [
    resolve(), // Helps Rollup find external modules
    commonjs(), // Converts CommonJS modules to ES6
    babel({ // Transpile modern JS for wider browser support
        babelHelpers: 'bundled',
        presets: [['@babel/preset-env', {
            targets: {
                browsers: '> 0.5%, not dead'
            }
        }]]
    }),
    isProduction && terser(), // Minify the output in production
  ],
};

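The configuration is not complicated, but it matters. It lets sw.js import npm packages (for example, a library that manages retry logic), use modern JavaScript syntax, and end up as a single optimized script ready for the browser to register. A common mistake is to hand-write one huge sw.js file, which becomes hard to maintain as the project grows.

Step 3: The Service Worker's Core Interception and Authentication Logic

This is the heart of the solution. src/sw.js has to handle a fairly involved asynchronous flow and its state.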

// file: src/sw.js

const SW_VERSION = 'v1.0.0';
const AUTH_CHANNEL_NAME = 'auth-channel';
const AUTH_URL = '/saml/login';
const API_PREFIX = '/api/';

// A simple in-memory lock to prevent multiple auth flows at once.
let authInProgress = false;
let authPromise = null;

self.addEventListener('install', (event) => {
  console.log(`[SW ${SW_VERSION}] Installing...`);
  // Pre-caching assets can be done here, but for this example we focus on auth.
  // We use skipWaiting() to ensure the new SW activates immediately.
  event.waitUntil(self.skipWaiting());
});

self.addEventListener('activate', (event) => {
  console.log(`[SW ${SW_VERSION}] Activated.`);
  // This command ensures that the SW takes control of pages within its scope immediately.
  event.waitUntil(self.clients.claim());
});

self.addEventListener('fetch', (event) => {
  const { request } = event;

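  // We only care about same-origin API requests. Matching on the pathname avoids
  // false positives such as '/api/' appearing in a query string or another host.
  const url = new URL(request.url);
  if (url.origin !== self.location.origin || !url.pathname.startsWith(API_PREFIX)) {
    return;
  }

  // For navigation requests, do nothing special.
  if (request.mode === 'navigate') {
    return;
  }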

  // Respond with a promise that resolves to the final response.
  // This allows us to "hold" the request while we perform async operations.
  event.respondWith(
    fetch(request.clone())
      .then(response => {
        // If the response is a 401, it's our signal to re-authenticate.
        if (response.status === 401) {
          console.log(`[SW ${SW_VERSION}] Detected 401. Initiating transparent auth flow.`);
          // We call a function that handles the entire auth process
          // and returns a promise that resolves with the retried request's response.
          return handleAuthAndRetry(request);
        }
        // If the response is not 401, just return it.
        return response;
      })
      .catch(error => {
        // Handle network errors
        console.error(`[SW ${SW_VERSION}] Network error during fetch:`, error);
        throw error;
      })
  );
});

/**
 * Manages the transparent SAML authentication flow and retries the original request.
 * @param {Request} request The original request that failed.
 * @returns {Promise<Response>} A promise that resolves with the response from the retried request.
 */
function handleAuthAndRetry(request) {
  // If another request has already started the auth process,
  // just wait for its promise to resolve and then retry.
  if (authInProgress) {
    console.log(`[SW ${SW_VERSION}] Auth already in progress. Waiting...`);
    return authPromise.then(() => fetch(request.clone()));
  }

  console.log(`[SW ${SW_VERSION}] Starting new auth flow.`);
  authInProgress = true;

  authPromise = new Promise((resolve, reject) => {
    const authChannel = new BroadcastChannel(AUTH_CHANNEL_NAME);

    authChannel.onmessage = (event) => {
      if (event.data && event.data.status === 'SUCCESS') {
        console.log(`[SW ${SW_VERSION}] Auth successful signal received.`);
        authChannel.close();
        resolve();
      } else {
        console.error(`[SW ${SW_VERSION}] Auth failed signal received.`);
        authChannel.close();
        reject(new Error('Authentication failed.'));
      }
    };
    
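    // Open a new window or tab for the authentication. A Service Worker has no DOM,
    // so it cannot create an iframe itself; a hidden iframe would have to be created
    // by a page on the worker's behalf.
    // Note: browsers generally allow clients.openWindow() only in response to a user
    // interaction (for example a notification click), so the call may reject here.
    // It can also resolve with null once the window lands on a cross-origin URL
    // (such as the IdP), so a null client is not treated as a failure.
    self.clients.openWindow(AUTH_URL).catch(error => {
      authChannel.close();
      reject(new Error(`Failed to open auth window: ${error.message}`));
    });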

  }).finally(() => {
    // Reset the lock once the auth process is complete (either success or failure).
    authInProgress = false;
    authPromise = null;
    console.log(`[SW ${SW_VERSION}] Auth flow finished.`);
  });

  return authPromise.then(() => {
    console.log(`[SW ${SW_VERSION}] Retrying original request to ${request.url}`);
    return fetch(request.clone());
  });
}

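A few key design points in this code:

  1. Request interception and response inspection: the fetch event listener is the core. It only handles requests to /api/ and checks whether the response status is 401.
  2. Asynchronous flow control: event.respondWith is what makes this possible. It accepts a Promise, which lets us chain asynchronous work (such as waiting for authentication to finish) before handing the final response back to the main thread.
  3. Concurrency control: in practice, several API requests may fail at the same moment because the session expired. The simple authInProgress lock ensures that only one authentication flow is started; every other failed request waits for that flow to finish and then retries.
  4. Cross-context communication: the Service Worker, the main application thread, and the authentication window are three separate browser contexts. BroadcastChannel is a reliable way to pass messages between same-origin contexts. Once authentication succeeds, the page in the auth window broadcasts a success message, and the Service Worker continues when it receives it.
  5. Authentication window: self.clients.openWindow(AUTH_URL) opens a new window or tab to handle the SAML redirects. Compared with a hidden iframe, openWindow is less restricted when it comes to cross-origin redirects and user interaction (if a password has to be entered), but it may be blocked by pop-up blockers. The trade-off depends on the specific business scenario.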
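The concurrency-control idea in point 3 can also be expressed as a small reusable "single-flight" helper that adds a timeout, so that waiting requests are not held forever if the auth window never reports back. This is a sketch, not the article's implementation: `createAuthGate`, `startAuth`, and `timeoutMs` are hypothetical names.

```javascript
// Hypothetical helper: collapse concurrent auth attempts into one shared promise.
// startAuth is any function returning a promise that settles when auth finishes.
function createAuthGate(startAuth, timeoutMs) {
  let inFlight = null; // Shared promise while an auth flow is running

  return function ensureAuth() {
    if (inFlight) {
      return inFlight; // Later callers reuse the running flow
    }
    let timer;
    const timeout = new Promise((_, reject) => {
      timer = setTimeout(() => reject(new Error('Authentication timed out.')), timeoutMs);
    });
    // Wrapping in then() turns a synchronous throw from startAuth into a rejection.
    const attempt = Promise.resolve().then(startAuth);
    inFlight = Promise.race([attempt, timeout]).finally(() => {
      clearTimeout(timer);
      inFlight = null; // Allow a fresh flow after this one settles
    });
    return inFlight;
  };
}
```

In the worker, each failed request would then call something like `ensureAuth().then(() => fetch(request.clone()))`, and the shared promise replaces the separate authInProgress flag and authPromise variable.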

Step 4: Integrating the Main Application

The main application has little to do: register the Service Worker at startup.

// file: public/app.js

if ('serviceWorker' in navigator) {
  window.addEventListener('load', () => {
    navigator.serviceWorker.register('/sw.js')
      .then(registration => {
        console.log('ServiceWorker registration successful with scope: ', registration.scope);
      })
      .catch(error => {
        console.log('ServiceWorker registration failed: ', error);
      });
  });
}

// Example function to fetch data
async function fetchData() {
    try {
        const response = await fetch('/api/data');
        if (!response.ok) {
            // This part will now likely never be reached for 401s
            // because the SW handles it. But good for other errors.
            throw new Error(`HTTP error! status: ${response.status}`);
        }
        const data = await response.json();
        console.log('Data received:', data);
        document.getElementById('content').innerText = JSON.stringify(data, null, 2);
    } catch (error) {
        console.error('Failed to fetch data:', error);
        document.getElementById('content').innerText = 'Failed to load data.';
    }
}

document.getElementById('fetch-btn').addEventListener('click', fetchData);

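From app.js's point of view, fetchData is nothing special. It calls the API exactly as it normally would; all of the complexity is encapsulated in the Service Worker. That is the main value of this architecture.

Limitations and Future Directions

Although this approach addresses the core pain point, it is no silver bullet in production. First, it adds significant complexity to the frontend. The Service Worker lifecycle, caching strategy, and debugging all require specialized knowledge, and when something goes wrong it is much harder to track down than in main-thread code.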

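Second, relying on a pop-up window can cause problems. Increasingly strict pop-up blocking in browsers may prevent the authentication flow from starting at all. Users can be guided to allow pop-ups manually, but that undermines the core goal of being "transparent". A hidden iframe avoids this problem, but it requires the pages of both the IdP and the SP to send suitable X-Frame-Options or CSP headers, which is often hard to coordinate in a complex enterprise environment.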
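For the iframe variant, the framing policy on the SP side might look like the following. This is a sketch only: `buildFrameAncestors`, `allowFraming`, and the origin `https://app.example.com` are hypothetical, and the IdP's own headers remain outside the gateway's control.

```javascript
// Build a CSP frame-ancestors directive from a list of allowed embedding origins.
function buildFrameAncestors(allowedOrigins) {
  const sources = ["'self'", ...allowedOrigins];
  return `frame-ancestors ${sources.join(' ')}`;
}

// Express-style middleware factory that permits framing by the given origins.
function allowFraming(allowedOrigins) {
  const policy = buildFrameAncestors(allowedOrigins);
  return (req, res, next) => {
    res.setHeader('Content-Security-Policy', policy);
    // X-Frame-Options cannot allow a specific third-party origin in modern
    // browsers, so drop any value set earlier and rely on CSP instead.
    res.removeHeader('X-Frame-Options');
    next();
  };
}
```

It could be mounted only on the authentication routes, for example `app.use('/saml', allowFraming(['https://app.example.com']))`, so that the rest of the gateway keeps its default framing restrictions.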

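One future optimization is closer cooperation between the API gateway and the frontend. For example, the gateway could use a special HTTP header to warn the Service Worker when a session is about to expire. On receiving that signal, the Service Worker could proactively refresh the session in the background instead of reacting only after a request has failed, which would remove the authentication delay from the user experience. In addition, as newer standards such as the Web Authentication API mature, federated authentication schemes that do not depend on redirects and suit SPAs better may emerge.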
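The early-warning idea could start from a check like the one below, run by the Service Worker on every successful API response. Everything here is an assumption made for illustration: the header name `X-Session-Expires-In`, its unit (seconds), and the 60-second threshold are invented, not an existing gateway feature.

```javascript
// Assumption: the gateway adds this header to API responses; the name is made up.
const EXPIRY_HEADER = 'X-Session-Expires-In';
const REFRESH_THRESHOLD_SECONDS = 60;

// Returns true when the response says the session ends within the threshold.
function shouldRefreshSession(response, thresholdSeconds = REFRESH_THRESHOLD_SECONDS) {
  const raw = response.headers.get(EXPIRY_HEADER);
  if (raw === null || raw === undefined) {
    return false; // No hint from the gateway
  }
  const secondsLeft = Number.parseInt(raw, 10);
  if (Number.isNaN(secondsLeft)) {
    return false; // Ignore malformed values
  }
  return secondsLeft <= thresholdSeconds;
}
```

When the check returns true, the worker could kick off a background renewal while still returning the current response to the page untouched.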

