Cloud Runner v0 - Reliable and trimmed down cloud runner (#353)

* Update cloud-runner-aws-pipeline.yml

* Update cloud-runner-k8s-pipeline.yml

* yarn build

* yarn build

* correct branch ref

* correct branch ref passed to target repo

* Create k8s-tests.yml

* Delete k8s-tests.yml

* correct branch ref passed to target repo

* correct branch ref passed to target repo

* Always describe AWS tasks for now, because unstable error handling

* Remove unused tree commands

* Use lfs guid sum

* Simple override cache push

* Simple override cache push and pull override to allow pure cloud storage driven caching

* Removal of early branch (breaks lfs caching)

* Remove unused tree commands

* Update action.yml

* Update action.yml

* Support cache and input override commands as input + full support custom hooks

* Increase k8s timeout

* replace filename being appended for unknclear reason

* cache key should not contain whitespaces

* Always try and deploy rook for k8s

* Apply k8s files for rook

* Update action.yml

* Apply k8s files for rook

* Apply k8s files for rook

* cache test and action description for kuber storage class

* Correct test and implement dependency health check and start

* GCP-secret run, cache key

* lfs smudge set explicit and undo explicit

* Run using external secret provider to speed up input

* Update cloud-runner-aws-pipeline.yml

* Add nodejs as build step dependency

* Add nodejs as build step dependency

* Cloud Runner Tests must be specified to capture logs from cloud runner for tests

* Cloud Runner Tests must be specified to capture logs from cloud runner for tests

* Refactor and cleanup - no async input, combined setup/build, removed github logs for cli runs

* Refactor and cleanup - no async input, combined setup/build, removed github logs for cli runs

* Refactor and cleanup - no async input, combined setup/build, removed github logs for cli runs

* Refactor and cleanup - no async input, combined setup/build, removed github logs for cli runs

* Refactor and cleanup - no async input, combined setup/build, removed github logs for cli runs

* better defaults for new inputs

* better defaults

* merge latest

* force build update

* use npm n to update node in unity builder

* use npm n to update node in unity builder

* use npm n to update node in unity builder

* correct new line

* quiet zipping

* quiet zipping

* default secrets for unity username and password

* default secrets for unity username and password

* ls active directory before lfs install

* Get cloud runner secrets from

* Get cloud runner secrets from

* Cleanup setup of default secrets

* Various fixes

* Cleanup setup of default secrets

* Various fixes

* Various fixes

* Various fixes

* Various fixes

* Various fixes

* Various fixes

* Various fixes

* Various fixes

* Various fixes

* Various fixes

* Various fixes

* Various fixes

* Various fixes

* Various fixes

* AWS secrets manager support

* less caching logs

* default k8s storage class to pd-standard

* more readable build commands

* Capture aws exit code 1 reliably

* Always replace /head from branch

* k8s default storage class to standard-rwo

* cleanup

* further cleanup input

* further cleanup input

* further cleanup input

* further cleanup input

* further cleanup input

* folder sizes to inspect caching

* dir command for local cloud runner test

* k8s wait for pending because pvc will not create earlier

* prefer k8s standard storage

* handle empty string as cloud runner cluster input

* local-system is now used for cloud runner test implementation AND correctly unset test CLI input

* local-system is now used for cloud runner test implementation AND correctly unset test CLI input

* fix unterminated quote

* fix unterminated quote

* do not share build parameters in tests - in cloud runner this will cause conflicts with resouces of the same name

* remove head and heads from branch prefix

* fix reversed caching direction of cache-push

* fixes

* fixes

* fixes

* cachePull cli

* fixes

* fixes

* fixes

* fixes

* fixes

* order cache test to be first

* order cache test to be first

* fixes

* populate cache key instead of using branch

* cleanup cli

* garbage-collect-aws cli can iterate over aws resources and cli scans all ts files

* import cli methods

* import cli files explicitly

* import cli files explicitly

* import cli files explicitly

* import cli methods

* import cli methods

* import cli methods

* import cli methods

* import cli methods

* import cli methods

* import cli methods

* import cli methods

* import cli methods

* import cli methods

* import cli methods

* import cli methods

* import cli methods

* import cli methods

* import cli methods

* import cli methods

* import cli methods

* import cli methods

* import cli methods

* import cli methods

* import cli methods

* import cli methods

* import cli methods

* import cli methods

* import cli methods

* import cli methods

* import cli methods

* import cli methods

* import cli methods

* import cli methods

* import cli methods

* import cli methods

* import cli methods

* import cli methods

* import cli methods

* import cli methods

* import cli methods

* log parameters in cloud runner parameter test

* log parameters in cloud runner parameter test

* log parameters in cloud runner parameter test

* Cloud runner param test before caching because we have a fast local cache test now

* Using custom build path relative to repo root rather than project root

* aws-garbage-collect at end of pipeline

* aws-garbage-collect do not actually delete anything for now - just list

* remove some legacy du commands

* Update cloud-runner-aws-pipeline.yml

* log contents after cache pull and fix some scenarios with duplicate secrets

* log contents after cache pull and fix some scenarios with duplicate secrets

* log contents after cache pull and fix some scenarios with duplicate secrets

* PR comments

* Replace guid with uuid package

* use fileExists lambda instead of stat to check file exists in caching

* build failed results in core error message

* Delete sample.txt
This commit is contained in:
Frostebite
2022-04-11 00:00:37 +01:00
committed by GitHub
parent 2b399b2641
commit a61c02481f
65 changed files with 10262 additions and 2259 deletions
@@ -0,0 +1,201 @@
import * as k8s from '@kubernetes/client-node';
import { BuildParameters, Output } from '../../..';
import * as core from '@actions/core';
import { ProviderInterface } from '../provider-interface';
import CloudRunnerSecret from '../../services/cloud-runner-secret';
import KubernetesStorage from './kubernetes-storage';
import CloudRunnerEnvironmentVariable from '../../services/cloud-runner-environment-variable';
import KubernetesTaskRunner from './kubernetes-task-runner';
import KubernetesSecret from './kubernetes-secret';
import waitUntil from 'async-wait-until';
import KubernetesJobSpecFactory from './kubernetes-job-spec-factory';
import KubernetesServiceAccount from './kubernetes-service-account';
import CloudRunnerLogger from '../../services/cloud-runner-logger';
import { CoreV1Api } from '@kubernetes/client-node';
import DependencyOverrideService from '../../services/depdency-override-service';
class Kubernetes implements ProviderInterface {
private kubeConfig: k8s.KubeConfig;
private kubeClient: k8s.CoreV1Api;
private kubeClientBatch: k8s.BatchV1Api;
private buildGuid: string = '';
private buildParameters: BuildParameters;
private pvcName: string = '';
private secretName: string = '';
private jobName: string = '';
private namespace: string;
private podName: string = '';
private containerName: string = '';
private cleanupCronJobName: string = '';
private serviceAccountName: string = '';
constructor(buildParameters: BuildParameters) {
this.kubeConfig = new k8s.KubeConfig();
this.kubeConfig.loadFromDefault();
this.kubeClient = this.kubeConfig.makeApiClient(k8s.CoreV1Api);
this.kubeClientBatch = this.kubeConfig.makeApiClient(k8s.BatchV1Api);
CloudRunnerLogger.log('Loaded default Kubernetes configuration for this environment');
this.namespace = 'default';
this.buildParameters = buildParameters;
}
public async setup(
buildGuid: string,
buildParameters: BuildParameters,
// eslint-disable-next-line no-unused-vars
branchName: string,
// eslint-disable-next-line no-unused-vars
defaultSecretsArray: { ParameterKey: string; EnvironmentVariable: string; ParameterValue: string }[],
) {
try {
this.pvcName = `unity-builder-pvc-${buildGuid}`;
this.cleanupCronJobName = `unity-builder-cronjob-${buildGuid}`;
this.serviceAccountName = `service-account-${buildGuid}`;
if (await DependencyOverrideService.CheckHealth()) {
await DependencyOverrideService.TryStartDependencies();
}
await KubernetesStorage.createPersistentVolumeClaim(
buildParameters,
this.pvcName,
this.kubeClient,
this.namespace,
);
await KubernetesServiceAccount.createServiceAccount(this.serviceAccountName, this.namespace, this.kubeClient);
} catch (error) {
throw error;
}
}
async runTask(
buildGuid: string,
image: string,
commands: string,
mountdir: string,
workingdir: string,
environment: CloudRunnerEnvironmentVariable[],
secrets: CloudRunnerSecret[],
): Promise<string> {
try {
// setup
this.buildGuid = buildGuid;
this.secretName = `build-credentials-${buildGuid}`;
this.jobName = `unity-builder-job-${buildGuid}`;
this.containerName = `main`;
await KubernetesSecret.createSecret(secrets, this.secretName, this.namespace, this.kubeClient);
const jobSpec = KubernetesJobSpecFactory.getJobSpec(
commands,
image,
mountdir,
workingdir,
environment,
secrets,
this.buildGuid,
this.buildParameters,
this.secretName,
this.pvcName,
this.jobName,
k8s,
);
//run
const jobResult = await this.kubeClientBatch.createNamespacedJob(this.namespace, jobSpec);
CloudRunnerLogger.log(`Creating build job ${JSON.stringify(jobResult.body.metadata, undefined, 4)}`);
await new Promise((promise) => setTimeout(promise, 5000));
CloudRunnerLogger.log('Job created');
this.setPodNameAndContainerName(await Kubernetes.findPodFromJob(this.kubeClient, this.jobName, this.namespace));
CloudRunnerLogger.log('Watching pod until running');
let output = '';
// eslint-disable-next-line no-constant-condition
while (true) {
try {
await KubernetesTaskRunner.watchUntilPodRunning(this.kubeClient, this.podName, this.namespace);
CloudRunnerLogger.log('Pod running, streaming logs');
output = await KubernetesTaskRunner.runTask(
this.kubeConfig,
this.kubeClient,
this.jobName,
this.podName,
'main',
this.namespace,
CloudRunnerLogger.log,
);
break;
} catch (error: any) {
if (error.message.includes(`HTTP`)) {
continue;
} else {
throw error;
}
}
}
await this.cleanupTaskResources();
return output;
} catch (error) {
CloudRunnerLogger.log('Running job failed');
core.error(JSON.stringify(error, undefined, 4));
await this.cleanupTaskResources();
throw error;
}
}
setPodNameAndContainerName(pod: k8s.V1Pod) {
this.podName = pod.metadata?.name || '';
this.containerName = pod.status?.containerStatuses?.[0].name || '';
}
async cleanupTaskResources() {
CloudRunnerLogger.log('cleaning up');
try {
await this.kubeClientBatch.deleteNamespacedJob(this.jobName, this.namespace);
await this.kubeClient.deleteNamespacedPod(this.podName, this.namespace);
await this.kubeClient.deleteNamespacedSecret(this.secretName, this.namespace);
await new Promise((promise) => setTimeout(promise, 5000));
} catch (error) {
CloudRunnerLogger.log('Failed to cleanup, error:');
core.error(JSON.stringify(error, undefined, 4));
CloudRunnerLogger.log('Abandoning cleanup, build error:');
throw error;
}
try {
await waitUntil(
async () => {
const jobBody = (await this.kubeClientBatch.readNamespacedJob(this.jobName, this.namespace)).body;
const podBody = (await this.kubeClient.readNamespacedPod(this.podName, this.namespace)).body;
return (jobBody === null || jobBody.status?.active === 0) && podBody === null;
},
{
timeout: 500000,
intervalBetweenAttempts: 15000,
},
);
// eslint-disable-next-line no-empty
} catch {}
}
async cleanup(
buildGuid: string,
buildParameters: BuildParameters,
// eslint-disable-next-line no-unused-vars
branchName: string,
// eslint-disable-next-line no-unused-vars
defaultSecretsArray: { ParameterKey: string; EnvironmentVariable: string; ParameterValue: string }[],
) {
CloudRunnerLogger.log(`deleting PVC`);
await this.kubeClient.deleteNamespacedPersistentVolumeClaim(this.pvcName, this.namespace);
await Output.setBuildVersion(buildParameters.buildVersion);
// eslint-disable-next-line unicorn/no-process-exit
process.exit();
}
static async findPodFromJob(kubeClient: CoreV1Api, jobName: string, namespace: string) {
const namespacedPods = await kubeClient.listNamespacedPod(namespace);
const pod = namespacedPods.body.items.find((x) => x.metadata?.labels?.['job-name'] === jobName);
if (pod === undefined) {
throw new Error("pod with job-name label doesn't exist");
}
return pod;
}
}
export default Kubernetes;
@@ -0,0 +1,161 @@
import { V1EnvVar, V1EnvVarSource, V1SecretKeySelector } from '@kubernetes/client-node';
import BuildParameters from '../../../build-parameters';
import { CloudRunnerBuildCommandProcessor } from '../../services/cloud-runner-build-command-process';
import CloudRunnerEnvironmentVariable from '../../services/cloud-runner-environment-variable';
import CloudRunnerSecret from '../../services/cloud-runner-secret';
import CloudRunner from '../../cloud-runner';
class KubernetesJobSpecFactory {
static getJobSpec(
command: string,
image: string,
mountdir: string,
workingDirectory: string,
environment: CloudRunnerEnvironmentVariable[],
secrets: CloudRunnerSecret[],
buildGuid: string,
buildParameters: BuildParameters,
secretName,
pvcName,
jobName,
k8s,
) {
environment.push(
...[
{
name: 'GITHUB_SHA',
value: buildGuid,
},
{
name: 'GITHUB_WORKSPACE',
value: '/data/repo',
},
{
name: 'PROJECT_PATH',
value: buildParameters.projectPath,
},
{
name: 'BUILD_PATH',
value: buildParameters.buildPath,
},
{
name: 'BUILD_FILE',
value: buildParameters.buildFile,
},
{
name: 'BUILD_NAME',
value: buildParameters.buildName,
},
{
name: 'BUILD_METHOD',
value: buildParameters.buildMethod,
},
{
name: 'CUSTOM_PARAMETERS',
value: buildParameters.customParameters,
},
{
name: 'CHOWN_FILES_TO',
value: buildParameters.chownFilesTo,
},
{
name: 'BUILD_TARGET',
value: buildParameters.targetPlatform,
},
{
name: 'ANDROID_VERSION_CODE',
value: buildParameters.androidVersionCode.toString(),
},
{
name: 'ANDROID_KEYSTORE_NAME',
value: buildParameters.androidKeystoreName,
},
{
name: 'ANDROID_KEYALIAS_NAME',
value: buildParameters.androidKeyaliasName,
},
],
);
const job = new k8s.V1Job();
job.apiVersion = 'batch/v1';
job.kind = 'Job';
job.metadata = {
name: jobName,
labels: {
app: 'unity-builder',
buildGuid,
},
};
job.spec = {
backoffLimit: 0,
template: {
spec: {
volumes: [
{
name: 'build-mount',
persistentVolumeClaim: {
claimName: pvcName,
},
},
],
containers: [
{
name: 'main',
image,
command: ['/bin/sh'],
args: ['-c', CloudRunnerBuildCommandProcessor.ProcessCommands(command, CloudRunner.buildParameters)],
workingDir: `${workingDirectory}`,
resources: {
requests: {
memory: buildParameters.cloudRunnerMemory,
cpu: buildParameters.cloudRunnerCpu,
},
},
env: [
...environment.map((x) => {
const environmentVariable = new V1EnvVar();
environmentVariable.name = x.name;
environmentVariable.value = x.value;
return environmentVariable;
}),
...secrets.map((x) => {
const secret = new V1EnvVarSource();
secret.secretKeyRef = new V1SecretKeySelector();
secret.secretKeyRef.key = x.ParameterKey;
secret.secretKeyRef.name = secretName;
const environmentVariable = new V1EnvVar();
environmentVariable.name = x.EnvironmentVariable;
environmentVariable.valueFrom = secret;
return environmentVariable;
}),
],
volumeMounts: [
{
name: 'build-mount',
mountPath: `/${mountdir}`,
},
],
lifecycle: {
preStop: {
exec: {
command: [
'bin/bash',
'-c',
`cd /data/builder/action/steps;
chmod +x /return_license.sh;
/return_license.sh;`,
],
},
},
},
},
],
restartPolicy: 'Never',
},
},
};
return job;
}
}
export default KubernetesJobSpecFactory;
@@ -0,0 +1,28 @@
import { CoreV1Api } from '@kubernetes/client-node';
import CloudRunnerSecret from '../../services/cloud-runner-secret';
import * as k8s from '@kubernetes/client-node';
const base64 = require('base-64');
class KubernetesSecret {
static async createSecret(
secrets: CloudRunnerSecret[],
secretName: string,
namespace: string,
kubeClient: CoreV1Api,
) {
const secret = new k8s.V1Secret();
secret.apiVersion = 'v1';
secret.kind = 'Secret';
secret.type = 'Opaque';
secret.metadata = {
name: secretName,
};
secret.data = {};
for (const buildSecret of secrets) {
secret.data[buildSecret.ParameterKey] = base64.encode(buildSecret.ParameterValue);
}
return kubeClient.createNamespacedSecret(namespace, secret);
}
}
export default KubernetesSecret;
@@ -0,0 +1,17 @@
import { CoreV1Api } from '@kubernetes/client-node';
import * as k8s from '@kubernetes/client-node';
class KubernetesServiceAccount {
static async createServiceAccount(serviceAccountName: string, namespace: string, kubeClient: CoreV1Api) {
const serviceAccount = new k8s.V1ServiceAccount();
serviceAccount.apiVersion = 'v1';
serviceAccount.kind = 'ServiceAccount';
serviceAccount.metadata = {
name: serviceAccountName,
};
serviceAccount.automountServiceAccountToken = false;
return kubeClient.createNamespacedServiceAccount(namespace, serviceAccount);
}
}
export default KubernetesServiceAccount;
@@ -0,0 +1,116 @@
import waitUntil from 'async-wait-until';
import * as core from '@actions/core';
import * as k8s from '@kubernetes/client-node';
import BuildParameters from '../../../build-parameters';
import CloudRunnerLogger from '../../services/cloud-runner-logger';
import YAML from 'yaml';
class KubernetesStorage {
public static async createPersistentVolumeClaim(
buildParameters: BuildParameters,
pvcName: string,
kubeClient: k8s.CoreV1Api,
namespace: string,
) {
if (buildParameters.kubeVolume) {
CloudRunnerLogger.log(buildParameters.kubeVolume);
pvcName = buildParameters.kubeVolume;
return;
}
const pvcList = (await kubeClient.listNamespacedPersistentVolumeClaim(namespace)).body.items.map(
(x) => x.metadata?.name,
);
CloudRunnerLogger.log(`Current PVCs in namespace ${namespace}`);
CloudRunnerLogger.log(JSON.stringify(pvcList, undefined, 4));
if (pvcList.includes(pvcName)) {
CloudRunnerLogger.log(`pvc ${pvcName} already exists`);
if (!buildParameters.isCliMode) {
core.setOutput('volume', pvcName);
}
return;
}
CloudRunnerLogger.log(`Creating PVC ${pvcName} (does not exist)`);
const result = await KubernetesStorage.createPVC(pvcName, buildParameters, kubeClient, namespace);
await KubernetesStorage.handleResult(result, kubeClient, namespace, pvcName);
}
public static async getPVCPhase(kubeClient: k8s.CoreV1Api, name: string, namespace: string) {
try {
return (await kubeClient.readNamespacedPersistentVolumeClaim(name, namespace)).body.status?.phase;
} catch (error) {
core.error('Failed to get PVC phase');
core.error(JSON.stringify(error, undefined, 4));
throw error;
}
}
public static async watchUntilPVCNotPending(kubeClient: k8s.CoreV1Api, name: string, namespace: string) {
try {
CloudRunnerLogger.log(`watch Until PVC Not Pending ${name} ${namespace}`);
CloudRunnerLogger.log(`${await this.getPVCPhase(kubeClient, name, namespace)}`);
await waitUntil(
async () => {
return (await this.getPVCPhase(kubeClient, name, namespace)) === 'Pending';
},
{
timeout: 750000,
intervalBetweenAttempts: 15000,
},
);
} catch (error: any) {
core.error('Failed to watch PVC');
core.error(error.toString());
core.error(
`PVC Body: ${JSON.stringify(
(await kubeClient.readNamespacedPersistentVolumeClaim(name, namespace)).body,
undefined,
4,
)}`,
);
throw error;
}
}
private static async createPVC(
pvcName: string,
buildParameters: BuildParameters,
kubeClient: k8s.CoreV1Api,
namespace: string,
) {
const pvc = new k8s.V1PersistentVolumeClaim();
pvc.apiVersion = 'v1';
pvc.kind = 'PersistentVolumeClaim';
pvc.metadata = {
name: pvcName,
};
pvc.spec = {
accessModes: ['ReadWriteOnce'],
storageClassName: buildParameters.kubeStorageClass === '' ? 'standard' : buildParameters.kubeStorageClass,
resources: {
requests: {
storage: buildParameters.kubeVolumeSize,
},
},
};
if (process.env.K8s_STORAGE_PVC_SPEC) {
YAML.parse(process.env.K8s_STORAGE_PVC_SPEC);
}
const result = await kubeClient.createNamespacedPersistentVolumeClaim(namespace, pvc);
return result;
}
private static async handleResult(
result: { response: import('http').IncomingMessage; body: k8s.V1PersistentVolumeClaim },
kubeClient: k8s.CoreV1Api,
namespace: string,
pvcName: string,
) {
const name = result.body.metadata?.name || '';
CloudRunnerLogger.log(`PVC ${name} created`);
await this.watchUntilPVCNotPending(kubeClient, name, namespace);
CloudRunnerLogger.log(`PVC ${name} is ready and not pending`);
core.setOutput('volume', pvcName);
}
}
export default KubernetesStorage;
@@ -0,0 +1,104 @@
import { CoreV1Api, KubeConfig, Log } from '@kubernetes/client-node';
import { Writable } from 'stream';
import CloudRunnerLogger from '../../services/cloud-runner-logger';
import * as core from '@actions/core';
import { CloudRunnerStatics } from '../../cloud-runner-statics';
import waitUntil from 'async-wait-until';
import CloudRunner from '../../cloud-runner';
class KubernetesTaskRunner {
static async runTask(
kubeConfig: KubeConfig,
kubeClient: CoreV1Api,
jobName: string,
podName: string,
containerName: string,
namespace: string,
logCallback: any,
) {
CloudRunnerLogger.log(`Streaming logs from pod: ${podName} container: ${containerName} namespace: ${namespace}`);
const stream = new Writable();
let output = '';
let didStreamAnyLogs: boolean = false;
stream._write = (chunk, encoding, next) => {
didStreamAnyLogs = true;
let message = chunk.toString().trimRight(`\n`);
message = `[${CloudRunnerStatics.logPrefix}] ${message}`;
if (CloudRunner.buildParameters.cloudRunnerIntegrationTests) {
output += message;
}
logCallback(message);
next();
};
const logOptions = {
follow: true,
pretty: false,
previous: false,
};
try {
const resultError = await new Promise((resolve) =>
new Log(kubeConfig).log(namespace, podName, containerName, stream, resolve, logOptions),
);
stream.destroy();
if (resultError) {
throw resultError;
}
if (!didStreamAnyLogs) {
core.error('Failed to stream any logs, listing namespace events, check for an error with the container');
core.error(
JSON.stringify(
{
events: (await kubeClient.listNamespacedEvent(namespace)).body.items
.filter((x) => {
return x.involvedObject.name === podName || x.involvedObject.name === jobName;
})
.map((x) => {
return {
type: x.involvedObject.kind,
name: x.involvedObject.name,
message: x.message,
};
}),
},
undefined,
4,
),
);
throw new Error(`No logs streamed from k8s`);
}
} catch (error) {
if (stream) {
stream.destroy();
}
throw error;
}
CloudRunnerLogger.log('end of log stream');
return output;
}
static async watchUntilPodRunning(kubeClient: CoreV1Api, podName: string, namespace: string) {
let success: boolean = false;
CloudRunnerLogger.log(`Watching ${podName} ${namespace}`);
await waitUntil(
async () => {
const status = await kubeClient.readNamespacedPodStatus(podName, namespace);
const phase = status?.body.status?.phase;
success = phase === 'Running';
CloudRunnerLogger.log(
`${status.body.status?.phase} ${status.body.status?.conditions?.[0].reason || ''} ${
status.body.status?.conditions?.[0].message || ''
}`,
);
if (success || phase !== 'Pending') return true;
return false;
},
{
timeout: 2000000,
intervalBetweenAttempts: 15000,
},
);
return success;
}
}
export default KubernetesTaskRunner;